TITLE

METHOD FOR THE PRODUCTION OF GLYCEROL 
BY RECOMBINANT ORGANISMS 
FIELD OF INVENTION 
The present invention relates to the field of molecular biology and the
use of recombinant organisms for the production of glycerol and compounds 
derived from the glycerol biosynthetic pathway. More specifically the invention 
describes the construction of a recombinant cell for the production of glycerol 
and derived compounds from a carbon substrate, the cell containing foreign 
genes encoding proteins having glycerol-3-phosphate dehydrogenase (G3PDH)
and glycerol-3-phosphatase (G3P phosphatase) activities where the endogenous 
genes encoding the glycerol-converting glycerol kinase and glycerol 
dehydrogenase activities have been deleted. 

BACKGROUND 

Glycerol is a compound in great demand by industry for use in
cosmetics, liquid soaps, food, pharmaceuticals, lubricants, anti-freeze solutions,
and in numerous other applications. The esters of glycerol are important in the 
fat and oil industry. Historically, glycerol has been isolated from animal fat and 
similar sources; however, the process is laborious and inefficient. Microbial
production of glycerol is preferred.

Not all organisms have a natural capacity to synthesize glycerol. 
However, the biological production of glycerol is known for some species of 
bacteria, algae, and yeast. The bacteria Bacillus licheniformis and Lactobacillus 
lycopersica synthesize glycerol. Glycerol production is found in the halotolerant
algae Dunaliella sp. and Asteromonas gracilis for protection against high
external salt concentrations (Ben-Amotz et al., (1982) Experientia 38:49-52). 
Similarly, various osmotolerant yeast synthesize glycerol as a protective 
measure. Most strains of Saccharomyces produce some glycerol during 
alcoholic fermentation and this production can be increased by the application of
osmotic stress (Albertyn et al., (1994) Mol. Cell. Biol. 14:4135-4144). Earlier
this century glycerol was produced commercially with Saccharomyces cultures 
to which steering reagents were added such as sulfites or alkalis. Through the 
formation of an inactive complex, the steering agents block or inhibit the 
conversion of acetaldehyde to ethanol; thus, excess reducing equivalents
(NADH) are available to or "steered" towards dihydroxyacetone phosphate
(DHAP) for reduction to produce glycerol. This method is limited by the partial
inhibition of yeast growth that is due to the sulfites. This limitation can be 
partially overcome by the use of alkalis which create excess NADH equivalents 
by a different mechanism. In this practice, the alkalis initiated a Cannizzaro
disproportionation to yield ethanol and acetic acid from two equivalents of
acetaldehyde. Thus, although production of glycerol is possible from naturally
occurring organisms, production is often subject to the need to control osmotic
stress of the cultures and the production of sulfites. A method free from these
limitations is desirable. Production of glycerol from recombinant organisms
containing foreign genes encoding key steps in the glycerol biosynthetic pathway 
is one possible route to such a method. 

A number of the genes involved in the glycerol biosynthetic pathway 
have been isolated. For example, the gene encoding glycerol-3-phosphate
dehydrogenase (DAR1, GPD1) has been cloned and sequenced from
Saccharomyces diastaticus (Wang et al., (1994), J. Bact. 176:7091-7095). The
DAR1 gene was cloned into a shuttle vector and used to transform E. coli where
expression produced active enzyme. Wang et al., supra, recognize that DAR1
is regulated by the cellular osmotic environment but do not suggest how the
gene might be used to enhance glycerol production in a recombinant organism.

Other glycerol-3-phosphate dehydrogenase enzymes have been isolated. 
For example, sn-glycerol-3-phosphate dehydrogenase has been cloned and 
sequenced from S. cerevisiae (Larason et al., (1993) Mol. Microbiol., 10:1101).
Albertyn et al. ((1994) Mol. Cell. Biol., 14:4135) teach the cloning of GPD1
encoding a glycerol-3-phosphate dehydrogenase from S. cerevisiae. Like Wang
et al., both Albertyn et al. and Larason et al. recognize the osmo-sensitivity of
the regulation of this gene but do not suggest how the gene might be used in the 
production of glycerol in a recombinant organism. 



As with G3PDH, glycerol-3-phosphatase has been isolated from
Saccharomyces cerevisiae and the protein identified as being encoded by the
GPP1 and GPP2 genes (Norbeck et al., (1996) J. Biol. Chem., 271:13875).
Like the genes encoding G3PDH, it appears that GPP2 is osmotically-induced.

Although the genes encoding G3PDH and G3P phosphatase have been
isolated, there is no teaching in the art that demonstrates glycerol production
from recombinant organisms with G3PDH/G3P phosphatase expressed together
or separately. Further, there is no teaching to suggest that efficient glycerol
production from any wild-type organism is possible using these two enzyme
activities that does not require applying some stress (salt or an osmolyte) to the
cell. In fact, the art suggests that G3PDH activities may not affect glycerol
production. For example, Eustace ((1987), Can. J. Microbiol., 33:112-117)
teaches hybridized yeast strains that produced glycerol at greater levels than the
parent strains. However, Eustace also demonstrates that G3PDH activity
remained constant or slightly lower in the hybridized strains as opposed to the
wild type.

Glycerol is an industrially useful material. However, other compounds
may be derived from the glycerol biosynthetic pathway that also have 
commercial significance. For example, glycerol-producing organisms may be 
engineered to produce 1,3-propanediol (U.S. 5686276), a monomer having
potential utility in the production of polyester fibers and the manufacture of
polyurethanes and cyclic compounds. It is known for example that in some 
organisms, glycerol is converted to 3-hydroxypropionaldehyde and then to 
1,3-propanediol through the actions of a dehydratase enzyme and an 
oxidoreductase enzyme, respectively. Bacterial strains able to produce 
1,3-propanediol have been found, for example, in the groups Citrobacter,
Clostridium, Enterobacter, Ilyobacter, Klebsiella, Lactobacillus, and 
Pelobacter. Glycerol dehydratase and diol dehydratase systems are described by 
Seyfried et al. (1996) J. Bacteriol. 178:5793-5796 and Tobimatsu et al. (1995)
J. Biol. Chem. 270:7142-7148, respectively. Recombinant organisms,
containing exogenous dehydratase enzyme, that are able to produce
1,3-propanediol have been described (U.S. 5686276). Although these 
organisms produce 1,3-propanediol, it is clear that they would benefit from a 
system that would minimize glycerol conversion. 

There are a number of advantages in engineering a glycerol-producing 
organism for the production of 1,3-propanediol where conversion of glycerol is
minimized. A microorganism capable of efficiently producing glycerol under 
physiological conditions is industrially desirable, especially when the glycerol 
itself will be used as a substrate in vivo as part of a more complex catabolic or 
biosynthetic pathway that could be perturbed by osmotic stress or the addition of 
steering agents (e.g., the production of 1,3-propanediol). Some attempts at
creating glycerol kinase and glycerol dehydrogenase mutants have been made.
For example, De Koning et al. (1990) Appl. Microbiol. Biotechnol. 32:693-698
report the methanol-dependent production of dihydroxyacetone and glycerol by
mutants of the methylotrophic yeast Hansenula polymorpha blocked in
dihydroxyacetone kinase and glycerol kinase. Methanol and an additional
substrate, required to replenish the xylulose-5-phosphate co-substrate of the
assimilation reaction, were used to produce glycerol; however, a 
dihydroxyacetone reductase (glycerol dehydrogenase) is also required. 
Similarly, Shaw and Cameron, Book of Abstracts, 211th ACS National Meeting, 
New Orleans, LA, March 24-28 (1996), BIOT-154 Publisher: American
Chemical Society, Washington, D.C., investigate the deletion of ldhA
(lactate dehydrogenase), glpK (glycerol kinase), and tpiA (triosephosphate 
isomerase) for the optimization of 1,3-propanediol production. They do not 
suggest the expression of cloned genes for G3PDH or G3P phosphatase for the 
production of glycerol or 1,3-propanediol and they do not discuss the impact of
glycerol dehydrogenase. 

The problem to be solved, therefore, is the lack of a process to direct 
carbon flux towards glycerol production by the addition or enhancement of 
certain enzyme activities, especially G3PDH and G3P phosphatase which
respectively catalyze the conversion of dihydroxyacetone phosphate (DHAP) to
glycerol-3-phosphate (G3P) and then to glycerol. The problem is complicated 
by the need to control the carbon flux away from glycerol by deletion or 
decrease of certain enzyme activities, especially glycerol kinase and glycerol 
dehydrogenase which respectively catalyze the conversion of glycerol plus ATP
to G3P and glycerol to dihydroxyacetone (or glyceraldehyde). 

SUMMARY OF THE INVENTION 
The present invention provides a method for the production of glycerol 
from a recombinant organism comprising: transforming a suitable host cell with 
an expression cassette comprising either one or both of (a) a gene encoding a
protein having glycerol-3-phosphate dehydrogenase activity and (b) a gene
encoding a protein having glycerol-3-phosphate phosphatase activity, where the
suitable host cell contains a disruption in either one or both of (a) a gene
encoding an endogenous glycerol kinase and (b) a gene encoding an endogenous
glycerol dehydrogenase, wherein the disruption prevents the expression of active
gene product; culturing the transformed host cell in the presence of at least one 
carbon source selected from the group consisting of monosaccharides, 
oligosaccharides, polysaccharides, and single-carbon substrates, whereby 
glycerol is produced; and recovering the glycerol produced. 

The present invention further provides a process for the production of
1,3-propanediol from a recombinant organism comprising: transforming a
suitable host cell with an expression cassette comprising either one or both of 
(a) a gene encoding a protein having glycerol-3-phosphate dehydrogenase 
activity and (b) a gene encoding a protein having glycerol-3-phosphate 
phosphatase activity, the suitable host cell having at least one gene encoding a
protein having a dehydratase activity and having a disruption in either one or 
both of (a) a gene encoding an endogenous glycerol kinase and (b) a gene 
encoding an endogenous glycerol dehydrogenase, wherein the disruption in the 
genes of (a) or (b) prevents the expression of active gene product; culturing the 
transformed host cell in the presence of at least one carbon source selected from
the group consisting of monosaccharides, oligosaccharides, polysaccharides, and 
single-carbon substrates whereby 1,3-propanediol is produced; and recovering 
the 1,3-propanediol produced. 

Additionally, the invention provides for a process for the production of
1,3-propanediol from a recombinant organism where multiple copies of
endogenous genes are introduced.

Further embodiments of the invention include host cells transformed with 
heterologous genes for the glycerol pathway as well as host cells which contain 
endogenous genes for the glycerol pathway.

Additionally, the invention provides recombinant cells suitable for the 
production of either glycerol or 1,3-propanediol, the host cells having genes
expressing either one or both of a glycerol-3-phosphate dehydrogenase activity 
and a glycerol-3-phosphate phosphatase activity wherein the cell also has 
disruptions in either one or both of a gene encoding an endogenous glycerol 
kinase and a gene encoding an endogenous glycerol dehydrogenase, wherein the 
disruption in the genes prevents the expression of active gene product. 

BRIEF DESCRIPTION OF THE FIGURES, BIOLOGICAL
DEPOSITS AND SEQUENCE LISTING

Figure 1 illustrates the representative enzymatic pathways involving 
glycerol metabolism. 

Applicants have made the following biological deposits under the terms 
of the Budapest Treaty on the International Recognition of the Deposit of 
Micro-organisms for the Purposes of Patent Procedure: 



Depositor Identification                 Int'l. Depository
Reference                                Designation         Date of Deposit

Escherichia coli pAH21/DH5α              ATCC 98187          26 September 1996
(containing the GPP2 gene)

Escherichia coli (pDAR1A/AA200)          ATCC 98248          6 November 1996
(containing the DAR1 gene)

FM5 Escherichia coli RJF10m              ATCC 98597          25 November 1997
(containing a glpK disruption)

FM5 Escherichia coli MSP33.6             ATCC 98598          25 November 1997
(containing a gldA disruption)

"ATCC" refers to the American Type Culture Collection international
depository located at 12301 Parklawn Drive, Rockville, MD 20852 U.S.A. The
designation is the accession number of the deposited material.

Applicants have provided 43 sequences in conformity with the Rules for 
the Standard Representation of Nucleotide and Amino Acid Sequences in Patent 
Applications (Annexes I and II to the Decision of the President of the EPO, 
published in Supplement No. 2 to OJ EPO, 12/1992) and with 37 C.F.R. 
1.821-1.825 and Appendices A and B (Requirements for Application Disclosures 
Containing Nucleotides and/or Amino Acid Sequences). 
DETAILED DESCRIPTION OF THE INVENTION

The present invention solves the problem stated above by providing a
method for the biological production of glycerol from a fermentable carbon 
source in a recombinant organism. The method provides a rapid, inexpensive 
and environmentally-responsible source of glycerol useful in the cosmetics and 
pharmaceutical industries. The method uses a microorganism containing cloned 
homologous or heterologous genes encoding glycerol-3-phosphate 
dehydrogenase (G3PDH) and/or glycerol-3-phosphatase (G3P phosphatase). 
These genes are expressed in a recombinant host having disruptions in genes 
encoding endogenous glycerol kinase and/or glycerol dehydrogenase enzymes. 
The method is useful for the production of glycerol, as well as any end products 
for which glycerol is an intermediate. The recombinant microorganism is 
contacted with a carbon source and cultured and then glycerol or any end 
products derived therefrom are isolated from the conditioned media. The genes 
may be incorporated into the host microorganism separately or together for the 
production of glycerol. 

Applicants' process has not previously been described for a recombinant
organism and required the isolation of genes encoding the two enzymes and their
subsequent expression in a host cell having disruptions in the endogenous kinase
and dehydrogenase genes. It will be appreciated by those familiar with this art
that Applicants' process may be generally applied to the production of compounds
where glycerol is a key intermediate, e.g., 1,3-propanediol.

As used herein the following terms may be used for interpretation of the 
claims and specification. 

The terms "glycerol-3-phosphate dehydrogenase" and "G3PDH" refer to
a polypeptide responsible for an enzyme activity that catalyzes the conversion of
dihydroxyacetone phosphate (DHAP) to glycerol-3-phosphate (G3P). In vivo
G3PDH may be NADH-, NADPH-, or FAD-dependent. The NADH-dependent
enzyme (EC 1.1.1.8) is encoded, for example, by several genes including GPD1
(GenBank Z74071x2), or GPD2 (GenBank Z35169x1), or GPD3 (GenBank
G984182), or DAR1 (GenBank Z74071x2). The NADPH-dependent enzyme 
(EC 1.1.1.94) is encoded by gpsA (GenBank U321643, (cds 197911-196892) 
G466746 and L45246). The FAD-dependent enzyme (EC 1.1.99.5) is encoded 
by GUT2 (GenBank Z47047x23), or glpD (GenBank G147838), or glpABC 
(GenBank M20938). 

The terms "glycerol-3-phosphatase", "sn-glycerol-3-phosphatase", or
"d,l-glycerol phosphatase", and "G3P phosphatase" refer to a polypeptide
responsible for an enzyme activity that catalyzes the conversion of glycerol-3-phosphate
and water to glycerol and inorganic phosphate. G3P phosphatase is
encoded, for example, by GPP1 (GenBank Z47047x125), or GPP2 (GenBank
U18813x11).

The term "glycerol kinase" refers to a polypeptide responsible for an 
enzyme activity that catalyzes the conversion of glycerol and ATP to glycerol-3-phosphate
and ADP. The high energy phosphate donor ATP may be replaced
by physiological substitutes (e.g. phosphoenolpyruvate). Glycerol kinase is
encoded, for example, by GUT1 (GenBank U11583x19) and glpK (GenBank
L19201). 

The term "glycerol dehydrogenase" refers to a polypeptide responsible
for an enzyme activity that catalyzes the conversion of glycerol to
dihydroxyacetone (E.C. 1.1.1.6) or glycerol to glyceraldehyde (E.C. 1.1.1.72).
A polypeptide responsible for an enzyme activity that catalyzes the conversion of
glycerol to dihydroxyacetone is also referred to as a "dihydroxyacetone
reductase". Glycerol dehydrogenase may be dependent upon NADH
(E.C. 1.1.1.6), NADPH (E.C. 1.1.1.72), or other cofactors (e.g.,
E.C. 1.1.99.22). A NADH-dependent glycerol dehydrogenase is encoded, for
example, by gldA (GenBank U00006). 

The term "dehydratase enzyme" will refer to any enzyme that is capable
of isomerizing or converting a glycerol molecule to the product
3-hydroxypropionaldehyde. For the purposes of the present invention the
dehydratase enzymes include a glycerol dehydratase (E.C. 4.2.1.30) and a diol
dehydratase (E.C. 4.2.1.28) having preferred substrates of glycerol and
1,2-propanediol, respectively. In Citrobacter freundii, for example, glycerol
dehydratase is encoded by three polypeptides whose gene sequences are
represented by dhaB, dhaC and dhaE (GenBank U09771: base pairs
8556-10223, 10235-10819, and 10822-11250, respectively). In Klebsiella
oxytoca, for example, diol dehydratase is encoded by three polypeptides whose
gene sequences are represented by pddA, pddB, and pddC (GenBank D45071:
base pairs 121-1785, 1796-2470, and 2485-3006, respectively).
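
The base-pair ranges quoted above can be sanity-checked with simple arithmetic. The following is a minimal Python sketch, offered only as an illustration: it computes the inclusive length of each quoted range and tests whether that length is a whole number of codons. The names GENE_RANGES, span_length and whole_codons are hypothetical, and the sketch assumes each range is an inclusive coordinate pair read on the forward strand; it makes no claim about whether a stop codon is included.

```python
# Illustrative only: all names here are hypothetical, not part of the specification.
# Coordinates are the base-pair ranges quoted in the text, assumed to be inclusive.
GENE_RANGES = {
    "dhaB": (8556, 10223),    # GenBank U09771
    "dhaC": (10235, 10819),
    "dhaE": (10822, 11250),
    "pddA": (121, 1785),      # GenBank D45071
    "pddB": (1796, 2470),
    "pddC": (2485, 3006),
}


def span_length(start: int, end: int) -> int:
    """Return the length in base pairs of an inclusive coordinate range."""
    if end < start:
        raise ValueError("expected start <= end for a forward-strand range")
    return end - start + 1


def whole_codons(start: int, end: int) -> bool:
    """Return True when the range length is divisible by three."""
    return span_length(start, end) % 3 == 0


if __name__ == "__main__":
    for gene, (start, end) in GENE_RANGES.items():
        length = span_length(start, end)
        print(f"{gene}: {length} bp, whole codons: {whole_codons(start, end)}")
```

A range that fails the divisibility test would suggest a transcription error in the coordinates rather than anything about the gene itself.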

The terms "GPD1", "DAR1", "OSG1", "D2830", and "YDL022W"
will be used interchangeably and refer to a gene that encodes a cytosolic
glycerol-3-phosphate dehydrogenase and is characterized by the base sequence
given as SEQ ID NO:1.

The term "GPD2" refers to a gene that encodes a cytosolic glycerol-3-phosphate
dehydrogenase and is characterized by the base sequence given in
SEQ ID NO:2.

The terms "GUT2" and "YIL155C" are used interchangeably and refer 
to a gene that encodes a mitochondrial glycerol-3-phosphate dehydrogenase and 
is characterized by the base sequence given in SEQ ID NO:3.

The terms "GPP1", "RHR2" and "YIL053W" are used interchangeably
and refer to a gene that encodes a cytosolic glycerol-3-phosphatase and is
characterized by the base sequence given in SEQ ID NO:4.

The terms "GPP2", "HOR2" and "YER062C" are used interchangeably
and refer to a gene that encodes a cytosolic glycerol-3-phosphatase and is
characterized by the base sequence given as SEQ ID NO:5.

The term "GUT1" refers to a gene that encodes a cytosolic glycerol 
kinase and is characterized by the base sequence given as SEQ ID NO:6. The 
term "glpK" refers to another gene that encodes a glycerol kinase and is 
characterized by the base sequence given in GenBank L19201, base pairs
77347-78855. 

The term "gldA" refers to a gene that encodes a glycerol dehydrogenase 
and is characterized by the base sequence given in GenBank U00006, base
pairs 3174-4316. The term "dhaD" refers to another gene that encodes a
glycerol dehydrogenase and is characterized by the base sequence given in
GenBank U09771, base pairs 2557-3654.

As used herein, the terms "function" and "enzyme function" refer to the 
catalytic activity of an enzyme in altering the energy required to perform a 
specific chemical reaction. Such an activity may apply to a reaction in 
equilibrium where the production of both product and substrate may be
accomplished under suitable conditions. 

The terms "polypeptide" and "protein" are used interchangeably. 

The terms "carbon substrate" and "carbon source" refer to a carbon 
source capable of being metabolized by host organisms of the present invention 
and particularly mean carbon sources selected from the group consisting of
monosaccharides, oligosaccharides, polysaccharides, and one-carbon substrates
or mixtures thereof. 

"Conversion" refers to the metabolic processes of an organism or cell 
that by means of a chemical reaction degrades or alters the complexity of a 
chemical compound or substrate.

The terms "host cell" and "host organism" refer to a microorganism 
capable of receiving foreign or heterologous genes and additional copies of 
endogenous genes and expressing those genes to produce an active gene
product. 

The terms "production cell" and "production organism" refer to a cell
engineered for the production of glycerol or compounds that may be derived
from the glycerol biosynthetic pathway. The production cell will be
recombinant and contain either one or both of a gene that encodes a protein
having a glycerol-3-phosphate dehydrogenase activity and a gene encoding a
protein having a glycerol-3-phosphatase activity. In addition to the G3PDH and
G3P phosphatase genes, the host cell will contain disruptions in one or both of a
gene encoding an endogenous glycerol kinase and a gene encoding an
endogenous glycerol dehydrogenase. Where the production cell is designed to
produce 1,3-propanediol, it will additionally contain a gene encoding a protein
having a dehydratase activity. 
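
The definition of a production cell above reduces to a small set of conditions, which the following Python sketch restates as a check. It is a minimal illustration under stated assumptions, not a description of any actual strain-design tool: the Strain class, the activity labels and the function is_production_cell are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field

# Hypothetical activity labels used only for this sketch.
SYNTHESIS = {"G3PDH", "G3P phosphatase"}
CONSUMPTION = {"glycerol kinase", "glycerol dehydrogenase"}


@dataclass(frozen=True)
class Strain:
    # Activities encoded by genes the cell carries for the pathway.
    genes: frozenset = field(default_factory=frozenset)
    # Endogenous activities whose genes have been disrupted.
    disrupted: frozenset = field(default_factory=frozenset)


def is_production_cell(strain: Strain, target: str = "glycerol") -> bool:
    """Check the conditions stated in the definition of a production cell."""
    has_synthesis = bool(SYNTHESIS & strain.genes)      # one or both of G3PDH / G3P phosphatase
    has_block = bool(CONSUMPTION & strain.disrupted)    # one or both disruptions
    if target == "glycerol":
        return has_synthesis and has_block
    if target == "1,3-propanediol":
        # A gene encoding a dehydratase activity is additionally required.
        return has_synthesis and has_block and "dehydratase" in strain.genes
    raise ValueError(f"unknown target: {target}")


if __name__ == "__main__":
    strain = Strain(
        genes=frozenset({"G3PDH", "G3P phosphatase"}),
        disrupted=frozenset({"glycerol kinase"}),
    )
    print(is_production_cell(strain))
    print(is_production_cell(strain, "1,3-propanediol"))
```

The check mirrors the "either one or both" wording by testing set intersections rather than requiring every activity to be present.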

The terms "foreign gene", "foreign DNA", "heterologous gene", and 
"heterologous DNA" all refer to genetic material native to one organism that has 
been placed within a different host organism. 

The term "endogenous" as used herein with reference to genes or
polypeptides expressed by genes, refers to genes or polypeptides that are native
to a production cell and are not derived from another organism. Thus an 
"endogenous glycerol kinase" and an "endogenous glycerol dehydrogenase" are 
terms referring to polypeptides encoded by genes native to the production cell. 

The terms "recombinant organism" and "transformed host" refer to any
organism transformed with heterologous or foreign genes. The recombinant
organisms of the present invention express foreign genes encoding G3PDH and 
G3P phosphatase for the production of glycerol from suitable carbon substrates. 
Additionally, the terms "recombinant organism" and "transformed host" refer to 
any organism transformed with endogenous (or homologous) genes so as to
increase the copy number of the genes. 

"Gene" refers to a nucleic acid fragment that expresses a specific 
protein, including regulatory sequences preceding (5' non-coding) and following
(3' non-coding) the coding region. The terms "native" and "wild-type" gene
refer to the gene as found in nature with its own regulatory sequences.

The terms "encoding" and "coding" refer to the process by which a 
gene, through the mechanisms of transcription and translation, produces an 
amino acid sequence. The process of encoding a specific amino acid sequence is 
meant to include DNA sequences that may involve base changes that do not 
cause a change in the encoded amino acid, or which involve base changes which
may alter one or more amino acids, but do not affect the functional properties of 
the protein encoded by the DNA sequence. Therefore, the invention 
encompasses more than the specific exemplary sequences. Modifications to the 
sequence, such as deletions, insertions, or substitutions in the sequence which 
produce silent changes that do not substantially affect the functional properties
of the resulting protein molecule are also contemplated. For example, 
alterations in the gene sequence which reflect the degeneracy of the genetic 
code, or which result in the production of a chemically equivalent amino acid at 
a given site, are contemplated; thus, a codon for the amino acid alanine, a 
hydrophobic amino acid, may be substituted by a codon encoding another less
hydrophobic residue, such as glycine, or a more hydrophobic residue, such as 
valine, leucine, or isoleucine. Similarly, changes which result in substitution of 
one negatively charged residue for another, such as aspartic acid for glutamic 
acid, or one positively charged residue for another, such as lysine for arginine,
can also be expected to produce a biologically equivalent product. Nucleotide 
changes which result in alteration of the N-terminal and C-terminal portions of 
the protein molecule would also not be expected to alter the activity of the 
protein. In some cases, it may in fact be desirable to make mutants of the
sequence in order to study the effect of alteration on the biological activity of the
protein. Each of the proposed modifications is well within the routine skill in 
the art, as is determination of retention of biological activity in the encoded 
products. Moreover, the skilled artisan recognizes that sequences encompassed 
by this invention are also defined by their ability to hybridize, under stringent
conditions (0.1X SSC, 0.1% SDS, 65 °C), with the sequences exemplified
herein. 

The term "expression" refers to the transcription and translation to gene 
product from a gene coding for the sequence of the gene product. 

The terms "plasmid", "vector", and "cassette" as used herein refer to an
extrachromosomal element often carrying genes which are not part of the
central metabolism of the cell and usually in the form of circular double- 
stranded DNA molecules. Such elements may be autonomously replicating 
sequences, genome integrating sequences, phage or nucleotide sequences, linear 
or circular, of a single- or double-stranded DNA or RNA, derived from any
source, in which a number of nucleotide sequences have been joined or
recombined into a unique construction which is capable of introducing a 
promoter fragment and DNA sequence for a selected gene product along with 
appropriate 3' untranslated sequence into a cell. "Transformation cassette" 
refers to a specific vector containing a foreign gene and having elements in
addition to the foreign gene that facilitate transformation of a particular host
cell. "Expression cassette" refers to a specific vector containing a foreign gene 
and having elements in addition to the foreign gene that allow for enhanced 
expression of that gene in a foreign host. 

The terms "transformation" and "transfection" refer to the acquisition of
new genes in a cell after the incorporation of nucleic acid. The acquired genes
may be integrated into chromosomal DNA or introduced as extrachromosomal 
replicating sequences. The term "transformant" refers to the cell resulting from 
a transformation. 

The term "genetically altered" refers to the process of changing
hereditary material by transformation or mutation. The terms "disruption" and
"gene interrupt" as applied to genes refer to a method of genetically altering an
organism by adding to or deleting from a gene a significant portion of that gene
such that the protein encoded by that gene is either not expressed or not
expressed in active form.

Glycerol Biosynthetic Pathway

It is contemplated that glycerol may be produced in recombinant 
organisms by the manipulation of the glycerol biosynthetic pathway found in 
most microorganisms. Typically, a carbon substrate such as glucose is 
converted to glucose-6-phosphate via hexokinase in the presence of ATP. 
Glucose-phosphate isomerase catalyzes the conversion of glucose-6-phosphate to 
fructose-6-phosphate and then to fructose-1,6-diphosphate through the action of
6-phosphofructokinase. The diphosphate is then taken to dihydroxyacetone
phosphate (DHAP) via aldolase. Finally NADH-dependent G3PDH converts
DHAP to glycerol-3-phosphate which is then dephosphorylated to glycerol by 
G3P phosphatase. (Agarwal (1990), Adv. Biochem. Engrg. 41:114). 
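
The sequence of conversions described above can be written down as an ordered list of steps. The Python sketch below is a minimal illustration that follows the wording of this paragraph only; it is not a complete model of glycolysis, and the names Step, PATHWAY, is_linear and enzymes_from are hypothetical. It checks that each step's product is the next step's substrate and lists the enzymes needed from a chosen intermediate onward.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    substrate: str
    enzyme: str
    product: str


# Steps as named in the paragraph above; cofactors (ATP, NADH) are omitted.
PATHWAY: List[Step] = [
    Step("glucose", "hexokinase", "glucose-6-phosphate"),
    Step("glucose-6-phosphate", "glucose-phosphate isomerase", "fructose-6-phosphate"),
    Step("fructose-6-phosphate", "6-phosphofructokinase", "fructose-1,6-diphosphate"),
    Step("fructose-1,6-diphosphate", "aldolase", "DHAP"),
    Step("DHAP", "G3PDH", "glycerol-3-phosphate"),
    Step("glycerol-3-phosphate", "G3P phosphatase", "glycerol"),
]


def is_linear(pathway: List[Step]) -> bool:
    """True when every product feeds the following step as its substrate."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def enzymes_from(pathway: List[Step], metabolite: str) -> List[str]:
    """Enzymes needed to reach the end of the pathway from a given substrate."""
    substrates = [step.substrate for step in pathway]
    start = substrates.index(metabolite)  # raises ValueError if not a substrate
    return [step.enzyme for step in pathway[start:]]


if __name__ == "__main__":
    print(is_linear(PATHWAY))
    print(enzymes_from(PATHWAY, "DHAP"))
```

Starting from DHAP, the list contains only the two activities the invention introduces, G3PDH and G3P phosphatase.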

Genes encoding G3PDH, glycerol dehydrogenase, G3P phosphatase and
glycerol kinase

The present invention provides genes suitable for the expression of
G3PDH and G3P phosphatase activities in a host cell.

Genes encoding G3PDH are known. For example, GPD1 has been 
isolated from Saccharomyces and has the base sequence given by SEQ ID NO:1,
encoding the amino acid sequence given in SEQ ID NO:7 (Wang et al., supra).
Similarly, G3PDH activity has also been isolated from Saccharomyces encoded
by GPD2 having the base sequence given in SEQ ID NO:2 encoding the amino
acid sequence given in SEQ ID NO:8 (Eriksson et al., (1995) Mol. Microbiol.,
17:95). 

For the purposes of the present invention it is contemplated that any gene 
encoding a polypeptide responsible for G3PDH activity is suitable wherein that
activity is capable of catalyzing the conversion of dihydroxyacetone phosphate
(DHAP) to glycerol-3-phosphate (G3P). Further, it is contemplated that any
gene encoding the amino acid sequence of G3PDH as given by SEQ ID NOS:7,
8, 9, 10, 11 and 12 corresponding to the genes GPD1, GPD2, GUT2, gpsA,
glpD, and the α subunit of glpABC respectively, will be functional in the
present invention wherein that amino acid sequence may encompass amino acid
substitutions, deletions or additions that do not alter the function of the enzyme.
The skilled person will appreciate that genes encoding G3PDH isolated from
other sources will also be suitable for use in the present invention. For
example, genes isolated from prokaryotes include GenBank accessions M34393,
M20938, L06231, U12567, L45246, L45323, L45324, L45325, U32164,
U32689, and U39682. Genes isolated from fungi include GenBank accessions
U30625, U30876 and X56162; genes isolated from insects include GenBank
accessions X61223 and X14179; and genes isolated from mammalian sources
include GenBank accessions U12424, M25558 and X78593.

Genes encoding G3P phosphatase are known. For example, GPP2 has 
been isolated from Saccharomyces cerevisiae and has the base sequence given by 
SEQ ID NO:5, which encodes the amino acid sequence given in SEQ ID NO: 13 
(Norbeck et al., (1996), J. Biol. Chem., 271:13875).

For the purposes of the present invention, any gene encoding a G3P 
phosphatase activity is suitable for use in the method wherein that activity is 
capable of catalyzing the conversion of glycerol-3-phosphate and water to
glycerol and inorganic phosphate. Further, any gene encoding the amino acid
sequence of G3P phosphatase as given by SEQ ID NOS:13 and 14
corresponding to the genes GPP2 and GPP1 respectively, will be functional in
the present invention including any amino acid sequence that encompasses amino
acid substitutions, deletions or additions that do not alter the function of the G3P
phosphatase enzyme. The skilled person will appreciate that genes encoding
G3P phosphatase isolated from other sources will also be suitable for use in the
present invention. For example, the dephosphorylation of glycerol-3-phosphate
to yield glycerol may be achieved with one or more of the following general or
specific phosphatases: alkaline phosphatase (EC 3.1.3.1) [GenBank M19159,
M29663, U02550 or M33965]; acid phosphatase (EC 3.1.3.2) [GenBank
U51210, U19789, U28658 or L20566]; glycerol-3-phosphatase (EC 3.1.3.-)
[GenBank Z38060 or U18813x11]; glucose-1-phosphatase (EC 3.1.3.10)
[GenBank M33807]; glucose-6-phosphatase (EC 3.1.3.9) [GenBank U00445];
fructose-1,6-bisphosphatase (EC 3.1.3.11) [GenBank X12545 or J03207] or
phosphatidyl glycerophosphate phosphatase (EC 3.1.3.27) [GenBank M23546
and M23628].

Genes encoding glycerol kinase are known. For example, GUT1 
encoding the glycerol kinase from Saccharomyces has been isolated and 
sequenced (Pavlik et al. (1993), Curr. Genet., 24:21) and the base sequence is 
given by SEQ ID NO:6, which encodes the amino acid sequence given in 
SEQ ID NO:15. Alternatively, glpK encodes a glycerol kinase from E. coli and
is characterized by the base sequence given in GenBank L19201, base pairs
77347-78855. 

Genes encoding glycerol dehydrogenase are known. For example, gldA 
encodes a glycerol dehydrogenase from E. coli and is characterized by the base
sequence given in GenBank U00006, base pairs 3174-4316. Alternatively,
dhaD refers to another gene that encodes a glycerol dehydrogenase from
Citrobacter freundii and is characterized by the base sequence given in
GenBank U09771, base pairs 2557-3654.
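
The base-pair ranges quoted for glpK, gldA and dhaD can be sanity-checked by computing the length of each inclusive interval and confirming that it is a whole number of codons. The following Python sketch does only that arithmetic. The function names are illustrative, and the codon counts assume (this is not stated in the text) that each range spans a complete coding sequence including its stop codon.

```python
def cds_length(start, end):
    """Length in base pairs of an inclusive, 1-based coordinate range."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinate ranges as quoted in the text: (gene, GenBank accession, start, end).
RANGES = [
    ("glpK", "L19201", 77347, 78855),
    ("gldA", "U00006", 3174, 4316),
    ("dhaD", "U09771", 2557, 3654),
]


def summarize(ranges):
    """Return (gene, accession, length_bp, is_whole_codons, codon_count) rows."""
    rows = []
    for gene, accession, start, end in ranges:
        length = cds_length(start, end)
        rows.append((gene, accession, length, length % 3 == 0, length // 3))
    return rows


if __name__ == "__main__":
    for gene, accession, length, whole, codons in summarize(RANGES):
        print(f"{gene} ({accession}): {length} bp, "
              f"whole codons: {whole}, codon count: {codons}")
```

A range whose length is not divisible by three would suggest a transcription error in the coordinates rather than a property of the gene.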
Host cells

Suitable host cells for the recombinant production of glycerol by the 
expression of G3PDH and G3P phosphatase may be either prokaryotic or 
eukaryotic and will be limited only by their ability to express active enzymes. 
Preferred host cells will be those bacteria, yeasts, and filamentous fungi 
typically useful for the production of glycerol such as Citrobacter, Enterobacter,
Clostridium, Klebsiella, Aerobacter, Lactobacillus, Aspergillus, Saccharomyces,
Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida,
Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia,
Salmonella, Bacillus, Streptomyces and Pseudomonas. Preferred in the present
invention are E. coli and Saccharomyces.

Where glycerol is a key intermediate in the production of 1,3-propanediol,
the host cell will either have an endogenous gene encoding a protein having
a dehydratase activity or will acquire such a gene through transformation. Host
cells particularly suited for production of 1,3-propanediol are Citrobacter,
Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, and
Salmonella, which have endogenous genes encoding dehydratase enzymes.
Host cells that lack such an endogenous gene include E. coli.
Vectors And Expression Cassettes 

The present invention provides a variety of vectors and transformation 
and expression cassettes suitable for the cloning, transformation and expression
of G3PDH and G3P phosphatase into a suitable host cell. Suitable vectors will
be those which are compatible with the bacterium employed. Suitable vectors
can be derived, for example, from a bacterium, a virus (such as bacteriophage T7
or an M13-derived phage), a cosmid, a yeast or a plant. Protocols for obtaining
and using such vectors are known to those in the art (Sambrook et al., Molecular
Cloning: A Laboratory Manual - volumes 1, 2, 3 (Cold Spring Harbor 
Laboratory: Cold Spring Harbor, NY, 1989)). 

Typically, the vector or cassette contains sequences directing 
transcription and translation of the appropriate gene, a selectable marker, and 
sequences allowing autonomous replication or chromosomal integration.
Suitable vectors comprise a region 5' of the gene which harbors transcriptional
initiation controls and a region 3' of the DNA fragment which controls
transcriptional termination. It is most preferred when both control regions are
derived from genes homologous to the transformed host cell. Such control
regions need not be derived from the genes native to the specific species chosen
as a production host.

Initiation control regions, or promoters, which are useful to drive 
expression of the G3PDH and G3P phosphatase genes in the desired host cell are 
numerous and familiar to those skilled in the art. Virtually any promoter
capable of driving these genes is suitable for the present invention including but
not limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH,
ADC1, TRP1, URA3, LEU2, ENO, and TPI (useful for expression in
Saccharomyces); AOX1 (useful for expression in Pichia); and lac, trp, λPL,
λPR, T7, tac, and trc (useful for expression in E. coli).

Termination control regions may also be derived from various genes 
native to the preferred hosts. Optionally, a termination site may be unnecessary; 
however, it is most preferred if included. 

For effective expression of the instant enzymes, DNA encoding the
enzymes is linked operably through initiation codons to selected expression
control regions such that expression results in the formation of the appropriate 
messenger RNA. 

Transformation Of Suitable Hosts And Expression Of G3PDH And G3P
Phosphatase For The Production Of Glycerol

Once suitable cassettes are constructed they are used to transform
appropriate host cells. Introduction of the cassette containing the genes
encoding G3PDH and/or G3P phosphatase into the host cell may be
accomplished by known procedures such as by transformation, e.g., using
calcium-permeabilized cells, electroporation, or by transfection using a
recombinant phage virus (Sambrook et al., supra).

In the present invention AH21 and DAR1 cassettes were used to
transform E. coli DH5α and FM5 as fully described in the GENERAL
METHODS and EXAMPLES. 

Alternatively, it is contemplated that suitable host cells comprising 
endogenous G3PDH and/or G3P phosphatase genes may be manipulated so that
the relevant genes are upregulated for the production of glycerol. 

Methods for upregulation of endogenous genes are well known in the art. 
For example, to upregulate the desired gene(s), a structural gene is generally 
placed downstream from a promoter region on the DNA which is recognized by 
the recipient microorganism. In addition to the promoter, one may include other 
regulatory sequences that increase or control expression from heterologous genes. 
In addition, one may alter the regulatory sequences of endogenous genes by any 
known genetic manipulation for the same purpose. Expression may be controlled
by an inducer or a repressor so that the microorganism coordinately expresses the
gene(s) necessary to complete the desired metabolic pathway. 

In the instant invention, endogenous genes encoding G3PDH and/or G3P
phosphatase activities in the host cell could be placed under the control of
regulated promoters (e.g., lac or osmY) or constitutive promoters. For example,
a cassette may be constructed to contain a specific inducible or constitutive 
promoter, flanked by DNA of sufficient length and homology to the native gene 
to permit targeting. Introduction of the cassette under suitable growth 
conditions will result in homologous recombination between the cassette and the 
targeted portion of the gene and the replacement of the relevant native promoter
with the regulatable promoter. Such methods may be employed to effect the 
upregulation of endogenous genes encoding G3PDH and/or G3P phosphatase 
activities for the production of glycerol. 

Random And Site Specific Mutagenesis For Disrupting Enzyme Activities

Enzyme pathways by which organisms metabolize glycerol are known in
the art (Figure 1). Glycerol is converted to glycerol-3-phosphate (G3P) by an
ATP-dependent glycerol kinase; the G3P may then be oxidized to DHAP by
G3PDH. In a second pathway, glycerol is oxidized to dihydroxyacetone (DHA)
by a glycerol dehydrogenase; the DHA may then be converted to DHAP by an
ATP-dependent DHA kinase. In a third pathway, glycerol is oxidized to
glyceraldehyde by a glycerol dehydrogenase; the glyceraldehyde may be
phosphorylated to glyceraldehyde-3-phosphate by an ATP-dependent kinase.
DHAP and glyceraldehyde-3-phosphate, interconverted by the action of
triosephosphate isomerase, may be further metabolized via central metabolism
pathways. These pathways, by introducing by-products, are deleterious to
glycerol production.
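
The three glycerol-consuming routes described above can be written down as a small directed graph, which makes it easy to see which intermediates remain reachable from glycerol when particular enzyme activities are removed. The Python sketch below is illustrative only. The enzyme labels are simplifications: in particular, it treats the DHA-forming and glyceraldehyde-forming steps as one "glycerol dehydrogenase" activity and names the third-pathway kinase "glyceraldehyde kinase", neither of which the text specifies.

```python
from collections import deque

# (substrate, product, enzyme) steps as described in the text.
# Enzyme labels are simplified, hypothetical names used only for this sketch.
STEPS = [
    ("glycerol", "G3P", "glycerol kinase"),
    ("G3P", "DHAP", "G3PDH"),
    ("glycerol", "DHA", "glycerol dehydrogenase"),
    ("DHA", "DHAP", "DHA kinase"),
    ("glycerol", "glyceraldehyde", "glycerol dehydrogenase"),
    ("glyceraldehyde", "glyceraldehyde-3-phosphate", "glyceraldehyde kinase"),
    ("DHAP", "glyceraldehyde-3-phosphate", "triosephosphate isomerase"),
    ("glyceraldehyde-3-phosphate", "DHAP", "triosephosphate isomerase"),
]


def reachable(start, deleted=()):
    """Breadth-first search over STEPS, skipping steps whose enzyme is deleted.

    Returns the set of metabolites reachable from `start`, excluding `start`.
    """
    deleted = set(deleted)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for substrate, product, enzyme in STEPS:
            if substrate == current and enzyme not in deleted and product not in seen:
                seen.add(product)
                queue.append(product)
    return seen - {start}


if __name__ == "__main__":
    print("wild type:", sorted(reachable("glycerol")))
    print("kinase deleted:", sorted(reachable("glycerol", {"glycerol kinase"})))
    print("both deleted:",
          sorted(reachable("glycerol", {"glycerol kinase", "glycerol dehydrogenase"})))
```

In this simplified model, removing only one of the two activities still leaves a route from glycerol to DHAP, whereas removing both leaves no modeled route.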

One aspect of the present invention is the ability to provide a production 
organism for the production of glycerol where the glycerol-converting activities
of glycerol kinase and glycerol dehydrogenase have been deleted. Methods of
creating deletion mutants are common and well known in the art. For example,
wild type cells may be exposed to a variety of agents such as radiation or
chemical mutagens and then screened for the desired phenotype. When creating
mutations through radiation either ultraviolet (UV) or ionizing radiation may be
used. Suitable short wave UV wavelengths for genetic mutations will fall within
the range of 200 nm to 300 nm where 254 nm is preferred. UV radiation in this
wavelength range principally causes changes within the nucleic acid sequence from
guanine and cytosine to adenine and thymine. Since all cells have DNA
repair mechanisms that would repair most UV induced mutations, agents such as
caffeine and other inhibitors may be added to interrupt the repair process and
maximize the number of effective mutations. Long wave UV mutations using
light in the 300 nm to 400 nm range are also possible but are generally not as 
effective as the short wave UV light unless used in conjunction with various 
activators such as psoralen dyes that interact with the DNA.

Mutagenesis with chemical agents is also effective for generating mutants
and commonly used substances include chemicals that affect nonreplicating
DNA such as HNO2 and NH2OH, as well as agents that affect replicating DNA
such as acridine dyes, notable for causing frameshift mutations. Specific
methods for creating mutants using radiation or chemical agents are well
documented in the art. See for example Thomas D. Brock in Biotechnology: A
Textbook of Industrial Microbiology, Second Edition (1989) Sinauer Associates,
Inc., Sunderland, MA, or Deshpande, Mukund V., Appl. Biochem.
Biotechnol., 36, 227 (1992), herein incorporated by reference.

After mutagenesis has occurred, mutants having the desired phenotype 
may be selected by a variety of methods. Random screening is most common
where the mutagenized cells are selected for the ability to produce the desired 
product or intermediate. Alternatively, selective isolation of mutants can be 
performed by growing a mutagenized population on selective media where only 
resistant colonies can develop. Methods of mutant selection are highly 
developed and well known in the art of industrial microbiology. See Brock,
supra, and DeMancilha et al., Food Chem., 14, 313 (1984).

Biological mutagenic agents which target genes randomly are well known 
in the art. See for example De Bruijn and Rossbach in Methods for General and 
Molecular Bacteriology (1994) American Society for Microbiology, 
Washington, D.C. Alternatively, provided that the gene sequence is known,
chromosomal gene disruption with specific deletion or replacement is achieved
by homologous recombination with an appropriate plasmid. See for example
Hamilton et al. (1989) J. Bacteriol. 171:4617-4622, Balbas et al. (1993) Gene
136:211-213, Gueldener et al. (1996) Nucleic Acids Res. 24:2519-2524, and
Smith et al. (1996) Methods Mol. Cell. Biol. 5:270-277.

It is contemplated that any of the above cited methods may be used for 
the deletion or inactivation of glycerol kinase and glycerol dehydrogenase 
activities in the preferred production organism. 
Media and Carbon Substrates 
Fermentation media in the present invention must contain suitable carbon
substrates. Suitable substrates may include but are not limited to
monosaccharides such as glucose and fructose, oligosaccharides such as lactose or
sucrose, polysaccharides such as starch or cellulose or mixtures thereof and
unpurified mixtures from renewable feedstocks such as cheese whey permeate,
corn steep liquor, sugar beet molasses, and barley malt. Additionally, the carbon
substrate may also be a one-carbon substrate such as carbon dioxide or methanol
for which metabolic conversion into key biochemical intermediates has been
demonstrated.

Glycerol production from single carbon sources (e.g., methanol,
formaldehyde or formate) has been reported in methylotrophic yeasts (Yamada
et al. (1989), Agric. Biol. Chem., 53(2):541-543) and in bacteria (Hunter et al.
(1985), Biochemistry, 24:4148-4155). These organisms can assimilate single
carbon compounds, ranging in oxidation state from methane to formate, and
produce glycerol. The pathway of carbon assimilation can be through ribulose
monophosphate, through serine, or through xylulose-monophosphate
(Gottschalk, Bacterial Metabolism, Second Edition, Springer-Verlag: New
York (1986)). The ribulose monophosphate pathway involves the condensation
of formate with ribulose-5-phosphate to form a 6 carbon sugar that becomes
fructose and eventually the three carbon product, glyceraldehyde-3-phosphate.
Likewise, the serine pathway assimilates the one-carbon compound into the
glycolytic pathway via methylenetetrahydrofolate.

In addition to one and two carbon substrates, methylotrophic organisms 
are also known to utilize a number of other carbon-containing compounds such 
as methylamine, glucosamine and a variety of amino acids for metabolic
activity. For example, methylotrophic yeast are known to utilize the carbon
from methylamine to form trehalose or glycerol (Bellion et al. (1993), Microb.
Growth C1 Compd., [Int. Symp.], 7th, 415-32. Editor(s): Murrell, J. Collin;
Kelly, Don P. Publisher: Intercept, Andover, UK). Similarly, various species
of Candida will metabolize alanine or oleic acid (Sulter et al. (1990), Arch.
Microbiol., 153(5):485-9). Hence, the source of carbon utilized in the present 
invention may encompass a wide variety of carbon-containing substrates and will 
only be limited by the choice of organism. 

Although all of the above mentioned carbon substrates and mixtures 
thereof are suitable in the present invention, preferred carbon substrates are
monosaccharides, oligosaccharides, polysaccharides, single-carbon substrates or
mixtures thereof. More preferred are sugars such as glucose, fructose, sucrose,
maltose, lactose and single carbon substrates such as methanol and carbon
dioxide. Most preferred as a carbon substrate is glucose.

In addition to an appropriate carbon source, fermentation media must
contain suitable minerals, salts, cofactors, buffers and other components, known
to those skilled in the art, suitable for the growth of the cultures and promotion
of the enzymatic pathway necessary for glycerol production.
Culture Conditions

Typically cells are grown at 30 °C in appropriate media. Preferred 
growth media are common commercially prepared media such as Luria Bertani 
(LB) broth, Sabouraud Dextrose (SD) broth, or Yeast medium (YM) broth. 
Other defined or synthetic growth media may also be used and the appropriate
medium for growth of the particular microorganism will be known by one
skilled in the art of microbiology or fermentation science. Agents
known to modulate catabolite repression directly or indirectly, e.g., cyclic
adenosine 3':5'-monophosphate, may also be incorporated into the reaction
media. Similarly, agents known to modulate enzymatic activities
(e.g., sulfites, bisulfites, and alkalis) that lead to enhancement of glycerol
production may be used in conjunction with or as an alternative to genetic
manipulations.

Suitable pH ranges for the fermentation are between pH 5.0 and pH 9.0,
where the range of pH 6.0 to pH 8.0 is preferred for the initial condition.

Reactions may be performed under aerobic or anaerobic conditions 
where anaerobic or microaerobic conditions are preferred. 
Identification of G3PDH, glycerol dehydrogenase, G3P phosphatase, and
glycerol kinase activities

The levels of expression of the proteins G3PDH, G3P phosphatase,
glycerol dehydrogenase, and glycerol kinase are measured by enzyme assays.
Generally, G3PDH activity and glycerol dehydrogenase activity assays rely on
the spectral properties of the cosubstrate, NADH, in the DHAP conversion to
G3P and the DHA conversion to glycerol, respectively. NADH has intrinsic
UV/vis absorption and its consumption can be monitored spectrophotometrically
at 340 nm. G3P phosphatase activity can be measured by any method of 
measuring the inorganic phosphate liberated in the reaction. The most 
commonly used detection method uses the visible spectroscopic determination of 
a blue-colored phosphomolybdate ammonium complex. Glycerol kinase activity 
can be measured by the detection of G3P from glycerol and ATP, for example,
by NMR. Assays can be directed toward more specific characteristics of 
individual enzymes if necessary, for example, by the use of alternate cofactors. 
Identification and recovery of glycerol and other products (e.g., 1,3-propanediol)

Glycerol and other products (e.g., 1,3-propanediol) may be identified and
quantified by high performance liquid chromatography (HPLC) and gas
chromatography/mass spectrometry (GC/MS) analyses on the cell-free extracts.
Preferred is an HPLC method where the fermentation media are analyzed on an
analytical ion exchange column using a mobile phase of 0.01 N sulfuric acid in
an isocratic fashion.

Methods for the recovery of glycerol from fermentation media are known
in the art. For example, glycerol can be obtained from cell media by subjecting 
the reaction mixture to the following sequence of steps: filtration; water 
removal; organic solvent extraction; and fractional distillation (U.S. Patent 
No. 2,986,495).

Description Of The Preferred Embodiments 

Production of Glycerol 

The present invention describes a method for the production of glycerol 
from a suitable carbon source utilizing a recombinant organism. Particularly 
suitable in the invention is a bacterial host cell, transformed with an expression
cassette carrying either or both of a gene that encodes a protein having a
glycerol-3-phosphate dehydrogenase activity and a gene encoding a protein
having a glycerol-3-phosphatase activity. In addition to the G3PDH and G3P
phosphatase genes, the host cell will contain disruptions in either or both of
genes encoding endogenous glycerol kinase and glycerol dehydrogenase
enzymes. The combined effect of the foreign G3PDH and G3P phosphatase
genes (providing a pathway from the carbon source to glycerol) with the gene 
disruptions (blocking the conversion of glycerol) results in an organism that is 
capable of efficient and reliable glycerol production. 

Although the optimal organism for glycerol production contains the
above mentioned gene disruptions, glycerol production is possible with a host
cell containing either one or both of the foreign G3PDH and G3P phosphatase
genes in the absence of such disruptions. For example, the recombinant E. coli
strain AA200 carrying the DAR1 gene (Example 1) was capable of producing
between 0.38 g/L and 0.48 g/L of glycerol depending on fermentation
parameters. Similarly, the E. coli DH5α, carrying an expressible GPP2 gene
(Example 2), was capable of 0.2 g/L of glycerol production. Where both genes
are present (Examples 3 and 4), glycerol production attained about 40 g/L.
Where both genes are present in conjunction with an elimination of the
endogenous glycerol kinase activity, a reduction in the conversion of glycerol
may be seen (Example 8). Furthermore, the presence of glycerol dehydrogenase
activity is linked to the conversion of glycerol under glucose-limited conditions; 
thus, it is anticipated that the elimination of glycerol dehydrogenase activity will 
result in the reduction of glycerol conversion (Example 8). 

Production of 1,3-propanediol

The present invention may also be adapted for the production of 
1,3-propanediol by utilizing recombinant organisms expressing the foreign 
G3PDH and/or G3P phosphatase genes and containing disruptions in the 
endogenous glycerol kinase and/or glycerol dehydrogenase activities.
Additionally, the invention provides for the process for the production of
1,3-propanediol from a recombinant organism where multiple copies of
endogenous genes are introduced. In addition to these genetic alterations, the
production cell will require the presence of a gene encoding an active
dehydratase enzyme. The dehydratase enzyme activity may either be a glycerol
dehydratase or a diol dehydratase. The dehydratase enzyme activity may result 
from either the expression of an endogenous gene or from the expression of a 
foreign gene transfected into the host organism. Isolation and expression of
genes encoding suitable dehydratase enzymes are well known in the art and are
taught by applicants in PCT/US96/06705, filed 5 November 1996 and
U.S. 5686276 and U.S. 5633362, hereby incorporated by reference. It will be
appreciated that, as glycerol is a key intermediate in the production of
1,3-propanediol, where the host cell contains a dehydratase activity in
conjunction with expressed foreign G3PDH and/or G3P phosphatase genes and
in the absence of the glycerol-converting glycerol kinase or glycerol
dehydrogenase activities, the cell will be particularly suited for the production of
1,3-propanediol. 

The present invention is further defined in the following Examples. It 
should be understood that these Examples, while indicating preferred 
embodiments of the invention, are given by way of illustration only. From the
above discussion and these Examples, one skilled in the art can ascertain the 
essential characteristics of this invention, and without departing from the spirit 
and scope thereof, can make various changes and modifications of the invention 
to adapt it to various usages and conditions. 



GENERAL METHODS 

Procedures for phosphorylations, ligations, and transformations are well 
known in the art. Techniques suitable for use in the following examples may be 
found in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor Laboratory Press (1989).

Materials and methods suitable for the maintenance and growth of 
bacterial cultures are well known in the art. Techniques suitable for use in the 
following examples may be found in Manual of Methods for General 
Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene
W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, eds),
American Society for Microbiology, Washington, DC (1994) or in
Biotechnology: A Textbook of Industrial Microbiology (Thomas D. Brock,
Second Edition (1989) Sinauer Associates, Inc., Sunderland, MA). All reagents
and materials used for the growth and maintenance of bacterial cells were
obtained from Aldrich Chemicals (Milwaukee, WI), DIFCO Laboratories
(Detroit, MI), GIBCO/BRL (Gaithersburg, MD), or Sigma Chemical Company 
(St. Louis, MO) unless otherwise specified. 

The meaning of abbreviations is as follows: "h" means hour(s), "min"
means minute(s), "sec" means second(s), "d" means day(s), "mL" means
milliliters, "L" means liters.
Cell strains 

The following Escherichia coli strains were used for transformation and 
expression of G3PDH and G3P phosphatase. Strains were obtained from the 
E. coli Genetic Stock Center, ATCC, or Life Technologies (Gaithersburg, MD).

AA200 (garB10 fhuA22 ompF627 fadL701 relA1 pit-10 spoT1 tpi-1 phoM510
mcrB1) (Anderson et al., (1970), J. Gen. Microbiol., 62:329).

BB20 (tonA22 ΔphoA8 fadL701 relA1 glpR2 glpD3 pit-10 gpsA20 spoT1 T2R)
(Cronan et al., J. Bact., 118:598).

DH5α (deoR endA1 gyrA96 hsdR17 recA1 relA1 supE44 thi-1 Δ(lacZYA-argFV169)
phi80lacZΔM15 F-) (Woodcock et al., (1989), Nucl. Acids Res.,
17:3469).

FM5 Escherichia coli (ATCC 53911)

Identification of Glycerol

The conversion of glucose to glycerol was monitored by HPLC and/or
GC. Analyses were performed using standard techniques and materials available
to one of skill in the art of chromatography. One suitable method utilized a
Waters Maxima 820 HPLC system using UV (210 nm) and RI detection.
Samples were injected onto a Shodex SH-1011 column (8 mm x 300 mm;
Waters, Milford, MA) equipped with a Shodex SH-1011P precolumn (6 mm x
50 mm), temperature-controlled at 50 °C, using 0.01 N H2SO4 as mobile phase
at a flow rate of 0.69 mL/min. When quantitative analysis was desired, samples
were prepared with a known amount of trimethylacetic acid as an external
standard. Typically, the retention times of 1,3-propanediol (RI detection),
glycerol (RI detection) and glucose (RI detection) were 21.39 min, 17.03 min
and 12.66 min, respectively.
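
As an illustration of how the typical retention times quoted above might be used, the Python sketch below assigns an observed RI peak to the nearest reference compound. The matching window of 0.25 min and the function name are hypothetical choices made for this example; the text gives no tolerance, and in practice the window would be set from replicate injections of standards.

```python
# Typical retention times (minutes, RI detection) as quoted in the text.
REFERENCE_RT_MIN = {
    "1,3-propanediol": 21.39,
    "glycerol": 17.03,
    "glucose": 12.66,
}


def assign_peak(rt_min, tolerance_min=0.25):
    """Return the reference compound nearest to rt_min, or None if no
    reference lies within tolerance_min (a hypothetical window)."""
    name, ref = min(REFERENCE_RT_MIN.items(),
                    key=lambda item: abs(item[1] - rt_min))
    return name if abs(ref - rt_min) <= tolerance_min else None


if __name__ == "__main__":
    for observed in (12.70, 17.10, 19.00, 21.30):
        print(observed, "->", assign_peak(observed))
```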

Glycerol was also analyzed by GC/MS. Gas chromatography with mass 
spectrometry detection for separation and quantitation of glycerol was performed 
using a DB-WAX column (30 m, 0.32 mm I.D., 0.25 µm film thickness, J & W
Scientific, Folsom, CA) at the following conditions: injector: split, 1:15;
sample volume: 1 µL; temperature profile: 150 °C initial temperature with
30 sec hold, 40 °C/min to 180 °C, 20 °C/min to 240 °C, hold for 2.5 min.
Detection: EI Mass Spectrometry (Hewlett Packard 5971, San Fernando, CA),
quantitative SIM using ions 61 m/z and 64 m/z as target ions for glycerol and
glycerol-d8, and ion 43 m/z as qualifier ion for glycerol. Glycerol-d8 was used
as an internal standard. 
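
The GC oven program described above is a sequence of holds and linear ramps, so its total run time follows from simple arithmetic. The Python sketch below encodes the stated profile and sums the segment durations. The function name and data layout are illustrative, and the calculation ignores any cool-down or equilibration time, which the text does not give.

```python
def program_minutes(initial_temp_c, initial_hold_min, ramps):
    """Total duration of an oven program.

    ramps is a list of (rate_c_per_min, target_temp_c, hold_min) tuples
    applied in order, starting from initial_temp_c.
    """
    total = initial_hold_min
    temp = initial_temp_c
    for rate, target, hold in ramps:
        if rate <= 0:
            raise ValueError("ramp rate must be positive")
        total += abs(target - temp) / rate + hold
        temp = target
    return total


# Profile from the text: 150 °C with a 30 sec hold, 40 °C/min to 180 °C,
# then 20 °C/min to 240 °C with a 2.5 min hold.
RAMPS = [(40.0, 180.0, 0.0), (20.0, 240.0, 2.5)]

if __name__ == "__main__":
    print(f"total run time: {program_minutes(150.0, 0.5, RAMPS):.2f} min")
```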

Assay for glycerol-3-phosphatase, G3P phosphatase

The assay for enzyme activity was performed by incubating the extract
with an organic phosphate substrate in a bis-Tris or MES and magnesium buffer,
pH 6.5. The substrate used was either L-α-glycerol phosphate or D,L-α-glycerol
phosphate. The final concentrations of the reagents in the assay are: buffer
(20 mM bis-Tris or 50 mM MES); MgCl2 (10 mM); and substrate (20 mM). If
the total protein in the sample was low and no visible precipitation occurred with
an acid quench, the sample was conveniently assayed in the cuvette. This
method involved incubating an enzyme sample in a cuvette that contained
20 mM substrate (50 µL, 200 mM), 50 mM MES, 10 mM MgCl2, pH 6.5
buffer. The final phosphatase assay volume was 0.5 mL. The enzyme-containing
sample was added to the reaction mixture; the contents of the cuvette
were mixed and then the cuvette was placed in a circulating water bath at
T = 37 °C for 5 to 120 min, the length of time depending on whether the
phosphatase activity in the enzyme sample ranged from 2 to 0.02 U/mL. The
enzymatic reaction was quenched by the addition of the acid molybdate reagent
(0.4 mL). After the Fiske SubbaRow reagent (0.1 mL) and distilled water
(1.5 mL) were added, the solution was mixed and allowed to develop. After
10 min, to allow full color development, the absorbance of the samples was read
at 660 nm using a Cary 219 UV/Vis spectrophotometer. The amount of
inorganic phosphate released was compared to a standard curve that was
prepared by using a stock inorganic phosphate solution (0.65 mM) and preparing
6 standards with final inorganic phosphate concentrations ranging from 0.026 to
0.130 µmol/mL.
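
A standard curve of this kind is usually applied by fitting a straight line to the absorbance of the standards and inverting it for each sample. The Python sketch below shows one way to do that with an ordinary least-squares fit. Several things in it are assumptions rather than details from the text: the six standards are taken to be evenly spaced, phosphate concentrations are treated as µmol/mL (which is what dilution of a 0.65 mM stock gives), the absorbance readings are invented for illustration, the final color-development volume is taken as the sum of the stated additions (0.5 + 0.4 + 0.1 + 1.5 = 2.5 mL), and one unit (U) is taken as 1 µmol of phosphate released per minute.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def phosphate_conc(absorbance, slope, intercept):
    """Invert the standard curve: absorbance -> phosphate (µmol/mL)."""
    return (absorbance - intercept) / slope


def activity_u_per_ml(conc_umol_per_ml, final_volume_ml, minutes, sample_ml):
    """Phosphatase activity, assuming 1 U = 1 µmol phosphate per minute."""
    return conc_umol_per_ml * final_volume_ml / (minutes * sample_ml)


# Assumption: six evenly spaced standards across the stated range (µmol/mL).
STANDARDS = [0.026 + i * (0.130 - 0.026) / 5 for i in range(6)]
# Hypothetical A660 readings for those standards, for illustration only.
ABSORBANCES = [0.105, 0.188, 0.272, 0.354, 0.439, 0.521]
FINAL_VOLUME_ML = 0.5 + 0.4 + 0.1 + 1.5  # assay + molybdate + reagent + water

if __name__ == "__main__":
    slope, intercept = fit_line(STANDARDS, ABSORBANCES)
    conc = phosphate_conc(0.300, slope, intercept)  # hypothetical sample reading
    units = activity_u_per_ml(conc, FINAL_VOLUME_ML, minutes=10.0, sample_ml=0.05)
    print(f"slope={slope:.3f}, intercept={intercept:.4f}")
    print(f"phosphate={conc:.4f} µmol/mL, activity={units:.3f} U/mL")
```

The sample volume (0.05 mL) and incubation time (10 min) in the usage lines are placeholders; a real calculation would also subtract a no-enzyme blank.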

Spectrophotometric Assay for Glycerol 3-Phosphate Dehydrogenase (G3PDH)
Activity

The following procedure was used as modified below from a method
published by Bell et al. (1975), J. Biol. Chem., 250:7153-8. This method
involved incubating an enzyme sample in a cuvette that contained 0.2 mM
NADH, 2.0 mM dihydroxyacetone phosphate (DHAP), and enzyme in 0.1 M
Tris/HCl, pH 7.5 buffer with 5 mM DTT, in a total volume of 1.0 mL at 30 °C.
The spectrophotometer was set to monitor absorbance changes at the fixed
wavelength of 340 nm. The instrument was blanked on a cuvette containing
buffer only. After the enzyme was added to the cuvette, an absorbance reading
was taken. The first substrate, NADH (50 µL of 4 mM NADH; absorbance should
increase approx. 1.25 AU), was added to determine the background rate. The
rate should be followed for at least 3 min. The second substrate, DHAP (50 µL of
40 mM DHAP), was then added and the absorbance change over time was
monitored for at least 3 min to determine the gross rate. G3PDH
activity was defined by subtracting the background rate from the gross rate.
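
The arithmetic behind this assay can be checked in a few lines. The Python sketch below confirms that the stated additions give the stated final concentrations, estimates the absorbance step on adding NADH, and converts a background-corrected rate into µmol of NADH consumed per minute using the Beer-Lambert law. The molar absorptivity of NADH at 340 nm (6.22 mM^-1 cm^-1) and the 1 cm path length are standard values assumed here, not taken from the text, and the example rates are hypothetical.

```python
# Assumption: molar absorptivity of NADH at 340 nm, not stated in the text.
NADH_EXTINCTION_PER_MM_CM = 6.22


def final_conc_mm(stock_mm, added_ul, total_ml):
    """Final concentration (mM) after adding added_ul of stock to total_ml."""
    return stock_mm * (added_ul / 1000.0) / total_ml


def expected_absorbance(conc_mm, path_cm=1.0):
    """Beer-Lambert absorbance of NADH at 340 nm."""
    return NADH_EXTINCTION_PER_MM_CM * conc_mm * path_cm


def net_rate(gross_rate, background_rate):
    """Background-corrected rate; both are A340 decreases per minute."""
    return gross_rate - background_rate


def nadh_umol_per_min(delta_a_per_min, total_ml, path_cm=1.0):
    """Convert an A340 rate into µmol NADH consumed per minute in the cuvette."""
    mm_per_min = delta_a_per_min / (NADH_EXTINCTION_PER_MM_CM * path_cm)
    return mm_per_min * total_ml  # mM equals µmol/mL


if __name__ == "__main__":
    nadh = final_conc_mm(4.0, 50.0, 1.0)
    dhap = final_conc_mm(40.0, 50.0, 1.0)
    print(f"NADH {nadh:.2f} mM, DHAP {dhap:.2f} mM")
    print(f"expected A340 step: {expected_absorbance(nadh):.3f} AU")
    rate = net_rate(gross_rate=0.150, background_rate=0.010)  # hypothetical
    print(f"activity: {nadh_umol_per_min(rate, 1.0):.4f} µmol/min")
```

With these assumed constants, 0.2 mM NADH corresponds to roughly 1.24 AU, consistent with the approximately 1.25 AU increase noted in the procedure.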
13C-NMR Assay for Glycerol Kinase Activity

An appropriate amount of enzyme, typically a cell-free crude extract,
was added to a reaction mixture containing 40 mM ATP, 20 mM MgSO4,
21 mM uniformly 13C labelled glycerol (99%, Cambridge Isotope Laboratories),
and 0.1 M Tris-HCl, pH 9 for 75 min at 25 °C. The conversion of glycerol to
glycerol 3-phosphate was detected by 13C-NMR (125 MHz): glycerol
(63.11 ppm, d, J = 41 Hz and 72.66 ppm, t, J = 41 Hz); glycerol 3-phosphate
(62.93 ppm, d, J = 41 Hz; 65.31 ppm, br d, J = 43 Hz; and 72.66 ppm, dt,
J = 6, 41 Hz).

NADH-linked Glycerol Dehydrogenase Assay 

NADH-linked glycerol dehydrogenase activity in E. coli strains (gldA)
was determined after protein separation by non-denaturing polyacrylamide gel
electrophoresis. The conversion of glycerol plus NAD+ to dihydroxyacetone
plus NADH was coupled with the conversion of 3-[4,5-dimethylthiazol-2-yl]-
2,5-diphenyltetrazolium bromide (MTT) to a deeply colored formazan, using
phenazine methosulfate (PMS) as mediator (Tang et al. (1997) J. Bacteriol.
140:182).

Electrophoresis was performed in duplicate by standard procedures using
native gels (8-16% TG, 1.5 mm, 15 lane gels from Novex, San Diego, CA).
Residual glycerol was removed from the gels by washing 3x with 50 mM Tris or
potassium carbonate buffer, pH 9 for 10 min. The duplicate gels were
developed, with and without glycerol (approx. 0.16 M final concentration), in
15 mL of assay solution containing 50 mM Tris or potassium carbonate, pH 9,
60 mg ammonium sulfate, 75 mg NAD+, 1.5 mg MTT, and 0.5 mg PMS.

The presence or absence of NADH-linked glycerol dehydrogenase
activity in E. coli strains (gldA) was also determined, following polyacrylamide
gel electrophoresis, by reaction with polyclonal antibodies raised to purified
K. pneumoniae glycerol dehydrogenase (dhaD).

PLASMID CONSTRUCTION AND STRAIN CONSTRUCTION
Cloning and expression of glycerol 3-phosphatase for increase of glycerol
production in E. coli DH5α and FM5

The Saccharomyces cerevisiae chromosome V lambda clone 6592 (GenBank,
accession # U18813x11) was obtained from ATCC. The glycerol
3-phosphate phosphatase (GPP2) gene was cloned from the lambda
clone as target DNA using synthetic primers (SEQ ID NO:16 with
SEQ ID NO:17) incorporating a BamHI-RBS-XbaI site at the 5' end and a
SmaI site at the 3' end. The product was subcloned into pCR-Script (Stratagene,
Madison, WI) at the SrfI site to generate the plasmid pAH15 containing GPP2.
The plasmid pAH15 contains the GPP2 gene in the inactive orientation for
expression from the lac promoter in pCR-Script SK+. The BamHI-SmaI
fragment from pAH15 containing the GPP2 gene was inserted into pBlueScript II
SK+ to generate plasmid pAH19. The pAH19 contains the GPP2 gene in the
correct orientation for expression from the lac promoter. The XbaI-PstI
fragment from pAH19 containing the GPP2 gene was inserted into pPHOX2 to
create plasmid pAH21. The pAH21/DH5α is the expression plasmid.
Plasmids for the over-expression of DAR1 in E. coli

DAR1 was isolated by PCR cloning from genomic S. cerevisiae DNA
using synthetic primers (SEQ ID NO:18 with SEQ ID NO:19). Successful PCR
cloning places an NcoI site at the 5' end of DAR1 where the ATG within NcoI
is the DAR1 initiator methionine. At the 3' end of DAR1 a BamHI site is
introduced following the translation terminator. The PCR fragments were
digested with NcoI + BamHI and cloned into the same sites within the
expression plasmid pTrc99A (Pharmacia, Piscataway, NJ) to give pDAR1A.

In order to create a better ribosome binding site at the 5' end of DAR1,
an SpeI-RBS-NcoI linker obtained by annealing synthetic primers (SEQ ID
NO:20 with SEQ ID NO:21) was inserted into the NcoI site of pDAR1A to
create pAH40. Plasmid pAH40 contains the new RBS and DAR1 gene in the
correct orientation for expression from the trc promoter of pTrc99A (Pharmacia,
Piscataway, NJ). The NcoI-BamHI fragment from pDAR1A and a second
SpeI-RBS-NcoI linker, obtained by annealing synthetic primers (SEQ ID
NO:22 with SEQ ID NO:23), were inserted into the SpeI-BamHI site of
pBC-SK+ (Stratagene, Madison, WI) to create plasmid pAH42. The plasmid
pAH42 contains a chloramphenicol resistance gene.

Construction of expression cassettes for DAR1 and GPP2

Expression cassettes for DAR1 and GPP2 were assembled from the
individual DAR1 and GPP2 subclones described above using standard molecular
biology methods. The BamHI-PstI fragment from pAH19 containing the
ribosomal binding site (RBS) and GPP2 gene was inserted into pAH40 to create
pAH43. The BamHI-PstI fragment from pAH19 containing the RBS and GPP2
gene was inserted into pAH42 to create pAH45.

The ribosome binding site at the 5' end of GPP2 was modified as
follows. A BamHI-RBS-SpeI linker, obtained by annealing synthetic primers
GATCCAGGAAACAGA (SEQ ID NO:24) with CTAGTCTGTTTCCTG (SEQ
ID NO:25), was joined to the XbaI-PstI fragment from pAH19 containing the
GPP2 gene and inserted into the BamHI-PstI site of pAH40 to create pAH48.
Plasmid pAH48 contains the DAR1 gene, the modified RBS, and the GPP2 gene
in the correct orientation for expression from the trc promoter of pTrc99A
(Pharmacia, Piscataway, NJ).
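
The linker design can be checked on paper: the two oligonucleotides given for SEQ ID NO:24 and SEQ ID NO:25 should anneal into a short duplex with one 5' overhang compatible with BamHI-cut DNA and one compatible with SpeI- or XbaI-cut DNA. The following is a minimal sketch of that check. The helper names and the assumed 4-nt overhang length are illustrative only and are not part of the described method.

```python
# Sketch: check that SEQ ID NO:24 and SEQ ID NO:25 anneal into a linker
# with two 4-nt 5' overhangs. Helper names are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(top: str, bottom: str, overhang: int = 4):
    """Anneal two oligos assumed to leave 5' overhangs of equal length.

    Returns (top 5' overhang, duplex core, bottom 5' overhang) and raises
    ValueError if the core regions are not complementary.
    """
    core_top = top[overhang:]
    core_from_bottom = revcomp(bottom)[: len(bottom) - overhang]
    if core_top != core_from_bottom:
        raise ValueError("oligos are not complementary over the core")
    return top[:overhang], core_top, bottom[:overhang]


SEQ_ID_24 = "GATCCAGGAAACAGA"
SEQ_ID_25 = "CTAGTCTGTTTCCTG"

left, core, right = anneal(SEQ_ID_24, SEQ_ID_25)
print(left, core, right)
```

The overhangs come out as GATC (BamHI-compatible) and CTAG (compatible with both SpeI and XbaI ends), which is consistent with joining the linker to an XbaI-PstI fragment.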
Transformation of E. coli 

All the plasmids described here were transformed into E. coli DH5α or
FM5 using standard molecular biology techniques. The transformants were
verified by their DNA RFLP patterns.

EXAMPLE 1 
PRODUCTION OF GLYCEROL FROM E. COLI
TRANSFORMED WITH G3PDH GENE 

Media 

Synthetic media was used for anaerobic or aerobic production of glycerol
using E. coli cells transformed with pDAR1A. The media contained per liter
6.0 g Na2HPO4, 3.0 g KH2PO4, 1.0 g NH4Cl, 0.5 g NaCl, 1 mL 20%
MgSO4·7H2O, 8.0 g glucose, 40 mg casamino acids, 0.5 mL 1% thiamine
hydrochloride, 100 mg ampicillin.

Growth Conditions

Strain AA200 harboring pDAR1A or the pTrc99A vector was grown in
aerobic conditions in 50 mL of media shaking at 250 rpm in 250 mL flasks at
37 °C. At A600 0.2-0.3 isopropylthio-β-D-galactoside was added to a final
concentration of 1 mM and incubation continued for 48 h. For anaerobic
growth, samples of induced cells were used to fill Falcon #2054 tubes which
were capped and gently mixed by rotation at 37 °C for 48 h. Glycerol
production was determined by HPLC analysis of the culture supernatants.
Strain pDAR1A/AA200 produced 0.38 g/L glycerol after 48 h under anaerobic
conditions, and 0.48 g/L under aerobic conditions.
EXAMPLE 2
PRODUCTION OF GLYCEROL FROM E. COLI
TRANSFORMED WITH G3P PHOSPHATASE GENE (GPP2)

Media 

Synthetic phoA media was used in shake flasks to demonstrate the
increase of glycerol by GPP2 expression in E. coli. The phoA medium
contained per liter: Amisoy, 12 g; ammonium sulfate, 0.62 g; MOPS, 10.5 g;
Na-citrate, 1.2 g; NaOH (1 M), 10 mL; 1 M MgSO4, 12 mL; 100X trace
elements, 12 mL; 50% glucose, 10 mL; 1% thiamine, 10 mL; 100 mg/mL
L-proline, 10 mL; 2.5 mM FeCl3, 5 mL; mixed phosphates buffer, 2 mL (5 mL
0.2 M NaH2PO4 + 9 mL 0.2 M K2HPO4), and pH to 7.0. The 100X trace
elements for phoA medium contained per liter: ZnSO4·7H2O, 0.58 g; MnSO4·H2O,
0.34 g; CuSO4·5H2O, 0.49 g; CoCl2·6H2O, 0.47 g; H3BO3, 0.12 g;
NaMoO4·2H2O, 0.48 g.
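
As a rough orientation, the pH of the mixed phosphates buffer stock (5 mL 0.2 M NaH2PO4 + 9 mL 0.2 M K2HPO4) can be estimated with the Henderson-Hasselbalch equation. The sketch below assumes an ideal solution and a textbook second pKa of phosphoric acid of about 7.21 at 25 °C; neither value comes from this description, and the effective pKa at working ionic strength is somewhat lower, so the result is only an approximation.

```python
import math


def mixed_phosphate_ph(ml_acid: float, ml_base: float,
                       molar_acid: float = 0.2, molar_base: float = 0.2,
                       pka2: float = 7.21) -> float:
    """Estimate the pH of a NaH2PO4 / K2HPO4 mixture (ideal behaviour assumed).

    pka2 is an assumed textbook value for H2PO4- <-> HPO4(2-) at 25 degC.
    """
    mmol_acid = ml_acid * molar_acid   # H2PO4-, the acid form
    mmol_base = ml_base * molar_base   # HPO4(2-), the base form
    return pka2 + math.log10(mmol_base / mmol_acid)


ph = mixed_phosphate_ph(ml_acid=5.0, ml_base=9.0)
print(f"estimated stock pH: {ph:.2f}")
```

Because the complete medium is adjusted to pH 7.0 afterwards, this estimate concerns only the buffer stock itself.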

Shake Flask Experiments

The strains pAH21/DH5α (containing GPP2 gene) and pPHOX2/DH5α
(control) were grown in 45 mL of media (phoA media, 50 µg/mL carbenicillin,
and 1 µg/mL vitamin B12) in a 250 mL shake flask at 37 °C. The cultures were
grown under aerobic conditions (250 rpm shaking) for 24 h. Glycerol production
was determined by HPLC analysis of the culture supernatant. pAH21/DH5α
produced 0.2 g/L glycerol after 24 h.

EXAMPLE 3 

PRODUCTION OF GLYCEROL FROM D-GLUCOSE USING 
RECOMBINANT E. COLI CONTAINING BOTH GPP2 AND DAR1

Growth for demonstration of increased glycerol production by E. coli
DH5α containing pAH43 proceeds aerobically at 37 °C in shake-flask cultures
(Erlenmeyer flasks, liquid volume 1/5th of total volume).

Cultures in minimal media/1% glucose shake-flasks are started by
inoculation from overnight LB/1% glucose culture with antibiotic selection.
The minimal media are filter-sterilized defined media, final pH 6.8 (HCl),
containing per liter: 12.6 g (NH4)2SO4, 13.7 g K2HPO4, 0.2 g yeast extract
(Difco), 1 g NaHCO3, 5 mg vitamin B12, 5 mL Modified Balch's Trace-Element
Solution (the composition of which can be found in Methods for General and
Molecular Bacteriology (P. Gerhardt et al., eds, p. 158, American Society for
Microbiology, Washington, DC (1994))). The shake-flasks are incubated at
37 °C with vigorous shaking overnight, after which they are sampled for GC
analysis of the supernatant. The pAH43/DH5α showed glycerol production of
3.8 g/L after 24 h.
EXAMPLE 4

PRODUCTION OF GLYCEROL FROM D-GLUCOSE USING
RECOMBINANT E. COLI CONTAINING BOTH GPP2 AND DAR1
Example 4 illustrates the production of glycerol from the recombinant
E. coli DH5α/pAH48, containing both the GPP2 and DAR1 genes.

The strain DH5α/pAH48 was constructed as described above in the
GENERAL METHODS.
Pre-Culture

DH5α/pAH48 was pre-cultured for seeding into a fermentation run.
Components and protocols for the pre-culture are listed below.

Pre-Culture Media

KH2PO4                      30.0 g/L
Citric acid                  2.0 g/L
MgSO4·7H2O                   2.0 g/L
98% H2SO4                    2.0 mL/L
Ferric ammonium citrate      0.3 g/L
CaCl2·2H2O                   0.2 g/L
Yeast extract                5.0 g/L
Trace metals                 5.0 mL/L
Glucose                     10.0 g/L
Carbenicillin              100.0 mg/L

The above media components were mixed together and the pH adjusted
to 6.8 with NH4OH. The media was then filter sterilized.

Trace metals were used according to the following recipe:

Citric acid, monohydrate     4.0 g/L
MgSO4·7H2O                   3.0 g/L
MnSO4·H2O                    0.5 g/L
NaCl                         1.0 g/L
FeSO4·7H2O                   0.1 g/L
CoCl2·6H2O                   0.1 g/L
CaCl2                        0.1 g/L
ZnSO4·7H2O                   0.1 g/L
CuSO4·5H2O                   10 mg/L
AlK(SO4)2·12H2O              10 mg/L
H3BO3                        10 mg/L
Na2MoO4·2H2O                 10 mg/L
NiSO4·6H2O                   10 mg/L
Na2SeO3                      10 mg/L
Na2WO4·2H2O                  10 mg/L

Cultures were started from seed culture inoculated from 50 µL frozen
stock (15% glycerol as cryoprotectant) to 600 mL medium in a 2-L Erlenmeyer
flask. Cultures were grown at 30 °C in a shaker at 250 rpm for approximately
12 h and then used to seed the fermenter.
Fermentation growth
Vessel

15-L stirred tank fermenter

Medium

KH2PO4                       6.8 g/L
Citric acid                  2.0 g/L
MgSO4·7H2O                   2.0 g/L
98% H2SO4                    2.0 mL/L
Ferric ammonium citrate      0.3 g/L
CaCl2·2H2O                   0.2 g/L
Mazu DF204 antifoam          1.0 mL/L

The above components were sterilized together in the fermenter vessel.
The pH was raised to 6.7 with NH4OH. Yeast extract (5 g/L) and trace metals
solution (5 mL/L) were added aseptically from filter sterilized stock solutions.
Glucose was added from 60% feed to give a final concentration of 10 g/L.
Carbenicillin was added at 100 mg/L. Volume after inoculation was 6 L.

Environmental Conditions For Fermentation 

The temperature was controlled at 36 °C and the air flow rate was
controlled at 6 standard liters per minute. Back pressure was controlled at
0.5 bar. The agitator was set at 350 rpm. Aqueous ammonia was used to
control pH at 6.7. The glucose feed (60% glucose monohydrate) rate was
controlled to maintain excess glucose.
Results

The results of the fermentation run are given in Table 1.

Table 1

EFT     OD550   [Glucose]   [Glycerol]   Total Glucose   Total Glycerol
(hr)    (AU)    (g/L)       (g/L)        Fed (g)         Produced (g)

0       0.8     9.3                      25
6       4.7     4.0         2.0          49              14
8       5.4     0           3.6          71              25
10      6.7     0.0         4.7          116             33
12      7.4     2.1         7.0          157             49
14.2    10.4    0.3         10.0         230             70
16.2    18.1    9.7         15.5         259             106
18.2    12.4    14.5                     305
20.2    11.8    17.4        17.7         353             119
22.2    11.0    12.6                     382
24.2    10.8    6.5         26.6         404             178
26.2    10.9    6.8                      442
28.2    10.4    10.3        31.5         463             216
30.2    10.2    13.1        30.4         493             213
32.2    10.1    8.1         28.2         512             196
34.2    10.2    3.5         33.4         530             223
36.2    10.1    5.8                      548
38.2    9.8     5.1         36.1         512             233
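
Two summary figures are often derived from a table of this kind: the mass yield of glycerol on glucose fed and the average volumetric productivity. The sketch below computes both from the rows of Table 1 that report a glycerol value. The variable and function names are illustrative assumptions. Note that the yield is based on glucose fed rather than glucose consumed, so with residual glucose present it understates the yield on consumed substrate.

```python
# Rows of Table 1 with a reported glycerol value:
# (EFT in h, glycerol in g/L, total glucose fed in g, total glycerol produced in g)
TABLE_1 = [
    (6.0, 2.0, 49, 14), (8.0, 3.6, 71, 25), (10.0, 4.7, 116, 33),
    (12.0, 7.0, 157, 49), (14.2, 10.0, 230, 70), (16.2, 15.5, 259, 106),
    (20.2, 17.7, 353, 119), (24.2, 26.6, 404, 178), (28.2, 31.5, 463, 216),
    (30.2, 30.4, 493, 213), (32.2, 28.2, 512, 196), (34.2, 33.4, 530, 223),
    (38.2, 36.1, 512, 233),
]


def yield_on_fed_glucose(glycerol_g: float, glucose_fed_g: float) -> float:
    """Mass yield in g glycerol per g glucose fed."""
    return glycerol_g / glucose_fed_g


def average_productivity(titer_g_per_l: float, eft_h: float) -> float:
    """Average volumetric productivity in g/L/h since inoculation."""
    return titer_g_per_l / eft_h


eft, titer, fed, produced = TABLE_1[-1]          # final sample at 38.2 h
final_yield = yield_on_fed_glucose(produced, fed)
final_productivity = average_productivity(titer, eft)
print(f"yield on glucose fed: {final_yield:.3f} g/g")
print(f"average productivity: {final_productivity:.3f} g/L/h")
```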



EXAMPLE 5

ENGINEERING OF GLYCEROL KINASE MUTANTS OF E. COLI FM5
FOR PRODUCTION OF GLYCEROL FROM GLUCOSE
Construction of integration plasmid for glycerol kinase gene replacement in
E. coli FM5

E. coli FM5 genomic DNA was prepared using the Puregene DNA
Isolation Kit (Gentra Systems, Minneapolis, MN). A 1.0 kb DNA fragment
containing partial glpF and glycerol kinase (glpK) genes was amplified by PCR
(Mullis and Faloona, Methods Enzymol., 155:335-350, 1987) from FM5
genomic DNA using primers SEQ ID NO:26 and SEQ ID NO:27. A 1.1 kb
DNA fragment containing partial glpK and glpX genes was amplified by PCR
from FM5 genomic DNA using primers SEQ ID NO:28 and SEQ ID NO:29. A
MunI site was incorporated into primer SEQ ID NO:28. The 5' end of primer
SEQ ID NO:28 was the reverse complement of primer SEQ ID NO:27 to enable
subsequent overlap extension PCR. The gene splicing by overlap extension
technique (Horton et al., BioTechniques, 8:528-535, 1990) was used to generate
a 2.1 kb fragment by PCR using the above two PCR fragments as templates and
primers SEQ ID NO:26 and SEQ ID NO:29. This fragment represented a
deletion of 0.8 kb from the central region of the 1.5 kb glpK gene. Overall, this
fragment had 1.0 kb and 1.1 kb flanking regions on either side of the MunI
cloning site (within the partial glpK) to allow for chromosomal gene replacement
by homologous recombination.
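
The primer arrangement described above, in which the 5' end of the downstream forward primer is the reverse complement of the upstream reverse primer, can be illustrated in silico. The sketch below is a toy model under stated assumptions: all sequences are invented placeholders (the actual SEQ ID NO:26-29 sequences are not reproduced here), the function names are hypothetical, and the MunI recognition sequence CAATTG is taken from general knowledge rather than from this description.

```python
# Toy model of gene splicing by overlap extension (SOE).
# All sequences are invented placeholders, not the SEQ ID NO primers.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
MUNI_SITE = "CAATTG"  # MunI recognition sequence (assumed, general knowledge)


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def splice_by_overlap(upstream: str, downstream: str, min_overlap: int = 8) -> str:
    """Join two top strands whose 3' and 5' ends share an identical overlap."""
    for k in range(min(len(upstream), len(downstream)), min_overlap - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream + downstream[k:]
    raise ValueError("fragments share no usable overlap")


upstream_body = "ATGACCGAAAAGAAATATATCG"      # placeholder 5' flank
downstream_body = "TTGCGTATGGCACTGGAAGC"      # placeholder 3' flank
reverse_primer = "GGTACCTTCGAGCTGATC"         # placeholder upstream reverse primer

# The upstream product ends with the reverse complement of its reverse primer.
upstream_product = upstream_body + revcomp(reverse_primer)
# The downstream forward primer carries that same sequence at its 5' end,
# followed by the MunI site, so the downstream product starts with both.
downstream_product = revcomp(reverse_primer) + MUNI_SITE + downstream_body

spliced = splice_by_overlap(upstream_product, downstream_product)
print(spliced)
```

In this toy model the spliced product carries the MunI site exactly once, between the two flanks, which mirrors the role of the site as the cloning position for the resistance marker.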

The above 2.1 kb PCR fragment was blunt-ended (using mung bean
nuclease) and cloned into the pCR-Blunt vector using the Zero Blunt PCR
Cloning Kit (Invitrogen, San Diego, CA) to yield the 5.6 kb plasmid pRN100
containing kanamycin and Zeocin resistance genes. The 1.2 kb HincII fragment
from pLoxCat1 (unpublished results), containing a chloramphenicol-resistance
gene flanked by bacteriophage P1 loxP sites (Snaith et al., Gene, 166:173-174,
1995), was used to interrupt the glpK fragment in plasmid pRN100 by ligating it
to MunI-digested (and blunt-ended) plasmid pRN100 to yield the 6.9 kb plasmid
pRN101-1. A 376 bp fragment containing the R6K origin was amplified by
PCR from the vector pGP704 (Miller and Mekalanos, J. Bacteriol.,
170:2575-2583, 1988) using primers SEQ ID NO:30 and SEQ ID NO:31,
blunt-ended, and ligated to the 5.3 kb Asp718-AatII fragment (which was
blunt-ended) from pRN101-1 to yield the 5.7 kb plasmid pRN102-1 containing
kanamycin and chloramphenicol resistance genes. Substitution of the ColE1
origin region in pRN101-1 with the R6K origin to generate pRN102-1 also
involved deletion of most of the Zeocin resistance gene. The host for pRN102-1
replication was E. coli SY327 (Miller and Mekalanos, J. Bacteriol.,
170:2575-2583, 1988) which contains the pir gene necessary for the function of
the R6K origin.

Engineering Of Glycerol Kinase Mutant RJF10m With Chloramphenicol
Resistance Gene Interrupt

E. coli FM5 was electrotransformed with the non-replicative integration
plasmid pRN102-1 and transformants that were chloramphenicol-resistant
(12.5 µg/mL) and kanamycin-sensitive (30 µg/mL) were further screened for
glycerol non-utilization on M9 minimal medium containing 1 mM glycerol. An
EcoRI digest of genomic DNA from one such mutant, RJF10m, when probed
with the intact glpK gene via Southern analysis (Southern, J. Mol. Biol.,
98:503-517, 1975) indicated that it was a double-crossover integrant (glpK gene
replacement) since the two expected 7.9 kb and 2.0 kb bands were observed,
owing to the presence of an additional EcoRI site within the chloramphenicol
resistance gene. The wild-type control yielded the single expected 9.4 kb band.
A 13C NMR analysis of mutant RJF10m confirmed that it was incapable of
converting 13C-labeled glycerol and ATP to glycerol-3-phosphate. This glpK
mutant was further analyzed by genomic PCR using primer combinations SEQ
ID NO:32 and SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35, and SEQ
ID NO:32 and SEQ ID NO:35 which yielded the expected 2.3 kb, 2.4 kb, and
4.0 kb PCR fragments respectively. The wild-type control yielded the expected
3.5 kb band with primers SEQ ID NO:32 and SEQ ID NO:35. The glpK mutant
RJF10m was electrotransformed with plasmid pAH48 to allow glycerol
production from glucose. The glpK mutant E. coli RJF10m has been deposited
with ATCC under the terms of the Budapest Treaty on 24 November 1997.
Engineering Of Glycerol Kinase Mutant RJF10 With Chloramphenicol 
Resistance Gene Interrupt Removed 

After overnight growth on YENB medium (0.75% yeast extract, 0.8%
nutrient broth) at 37 °C, E. coli RJF10m in a water suspension was
electrotransformed with plasmid pJW168 (unpublished results), which contained
the bacteriophage P1 Cre recombinase gene under the control of the
IPTG-inducible lacUV5 promoter, a temperature-sensitive pSC101 replicon, and an
ampicillin resistance gene. Upon outgrowth in SOC medium at 30 °C,
transformants were selected at 30 °C (permissive temperature for pJW168
replication) on LB agar medium supplemented with carbenicillin (50 µg/mL) and
IPTG (1 mM). Two serial overnight transfers of pooled colonies were carried
out at 30 °C on fresh LB agar medium supplemented with carbenicillin and
IPTG in order to allow excision of the chromosomal chloramphenicol resistance
gene via recombination at the loxP sites mediated by the Cre recombinase
(Hoess and Abremski, J. Mol. Biol., 181:351-362, 1985). Resultant colonies
were replica-plated on to LB agar medium supplemented with carbenicillin and
IPTG and LB agar supplemented with chloramphenicol (12.5 µg/mL) to identify
colonies that were carbenicillin-resistant and chloramphenicol-sensitive,
indicating marker gene removal. An overnight 30 °C culture of one such colony
was used to inoculate 10 mL of LB medium. Upon growth at 30 °C to OD
(600 nm) of 0.6, the culture was incubated at 37 °C overnight. Several dilutions
were plated on prewarmed LB agar medium and the plates incubated overnight
at 42 °C (the non-permissive temperature for pJW168 replication). Resultant
colonies were replica-plated on to LB agar medium and LB agar medium
supplemented with carbenicillin (75 µg/mL) to identify colonies that were
carbenicillin-sensitive, indicating loss of plasmid pJW168. One such glpK
mutant, RJF10, was further analyzed by genomic PCR using primers SEQ ID
NO:32 and SEQ ID NO:35 and yielded the expected 3.0 kb band confirming
marker gene excision. Glycerol non-utilization by mutant RJF10 was confirmed
by lack of growth on M9 minimal medium containing 1 mM glycerol. The glpK
mutant RJF10 was electrotransformed with plasmid pAH48 to allow glycerol
production from glucose.

EXAMPLE 6 

CONSTRUCTION OF E. COLI STRAIN WITH GLDA GENE KNOCKOUT
The gldA gene was isolated from E. coli by PCR (K. B. Mullis and F. A.
Faloona (1987) Meth. Enzymol. 155:335-350) using primers SEQ ID NO:36
and SEQ ID NO:37, which incorporate terminal SphI and XbaI sites,
respectively, and cloned (T. Maniatis 1982 Molecular Cloning. A Laboratory
Manual. Cold Spring Harbor, Cold Spring Harbor, NY) between the SphI and
XbaI sites in pUC18, to generate pKP8. pKP8 was cut at the unique SalI and
NcoI sites within the gldA gene, the ends flushed with Klenow and religated,
resulting in a 109 bp deletion in the middle of gldA and regeneration of a unique
SalI site, to generate pKP9. A 1.4 kb DNA fragment containing the gene
conferring kanamycin resistance (kan), and including about 400 bps of DNA
upstream of the translational start codon and about 100 bps of DNA downstream
of the translational stop codon, was isolated from pET-28a(+) (Novagen,
Madison, Wis.) by PCR using primers SEQ ID NO:38 and SEQ ID NO:39,
which incorporate terminal SalI sites, and subcloned into the unique SalI site of
pKP9, to generate pKP13. A 2.1 kb DNA fragment beginning 204 bps
downstream of the gldA translational start codon and ending 178 bps upstream of
the gldA translational stop codon, and containing the kan insertion, was isolated
from pKP13 by PCR using primers SEQ ID NO:40 and SEQ ID NO:41, which
incorporate terminal SphI and XbaI sites, respectively, and subcloned between
the SphI and XbaI sites in pMAK705 (Genencor International, Palo Alto,
Calif.), to generate pMP33. E. coli FM5 was transformed with pMP33 and
selected on 20 µg/mL kan at 30 °C, which is the permissive temperature for
pMAK705 replication. One colony was expanded overnight at 30 °C in liquid
media supplemented with 20 µg/mL kan. Approximately 32,000 cells were
plated on 20 µg/mL kan and incubated for 16 hrs at 44 °C, which is the
restrictive temperature for pMAK705 replication. Transformants growing at
44 °C have plasmid integrated into the chromosome, occurring at a frequency of
approximately 0.0001. PCR and Southern blot (E.M. Southern 1975 J. Mol.
Biol. 98:503-517) analyses were used to determine the nature of the
chromosomal integration events in the transformants. Western blot analysis
(H. Towbin, et al. (1979) Proc. Natl. Acad. Sci. 76:4350) was used to
determine whether glycerol dehydrogenase protein, the product of gldA, is
produced in the transformants. An activity assay was used to determine whether
glycerol dehydrogenase activity remained in the transformants. Activity in
glycerol dehydrogenase bands on native gels was determined by coupling the
conversion of glycerol + NAD+ → dihydroxyacetone + NADH to the
conversion of a tetrazolium dye, MTT [3-(4,5-dimethylthiazol-2-yl)-2,5-
diphenyltetrazolium bromide], to a deeply colored formazan, with phenazine
methosulfate as mediator. Glycerol dehydrogenase also requires the presence of
30 mM ammonium sulfate and 100 mM Tris, pH 9 (C.-T. Tang, et al. (1997)
J. Bacteriol. 140:182). Of 8 transformants analyzed, 6 were determined to be
gldA knockouts. E. coli MSP33.6 has been deposited with ATCC under the
terms of the Budapest Treaty on 24 November 1997.
EXAMPLE 7

CONSTRUCTION OF E. COLI STRAIN
WITH GLPK AND GLDA GENE KNOCKOUTS
A 1.6 kb DNA fragment containing the gldA gene and including 228 bps
of DNA upstream of the translational start codon and 220 bps of DNA
downstream of the translational stop codon was isolated from E. coli by PCR
using primers SEQ ID NO:42 and SEQ ID NO:43, which incorporate terminal
SphI and XbaI sites, respectively, and cloned between the SphI and XbaI sites of
pUC18, to generate pQN2. pQN2 was cut at the unique SalI and NcoI sites
within the gldA gene, the ends flushed with Klenow and religated, resulting in a
109 bps deletion in the middle of gldA and regeneration of a unique SalI site, to
generate pQN4. A 1.2 kb DNA fragment containing the gene conferring
kanamycin resistance (kan), and flanked by loxP sites, was isolated from
pLoxKan2 (Genencor International, Palo Alto, Calif.) as a StuI/XhoI fragment,
the ends flushed with Klenow, and subcloned into pQN4 at the SalI site after
flushing with Klenow, to generate pQN8. A 0.4 kb DNA fragment containing the
R6K origin of replication was isolated from pGP704 (Miller and Mekalanos,
J. Bacteriol., 170:2575-2583, 1988) by PCR using primers SEQ ID NO:44 and
SEQ ID NO:45, which incorporate terminal SphI and XbaI sites, respectively,
and ligated to the 2.8 kb SphI/XbaI DNA fragment containing the gldA::kan
cassette from pQN8, to generate pKP22. A 1.0 kb DNA fragment containing the
gene conferring chloramphenicol resistance (cam), and flanked by loxP sites, was
isolated from pLoxCat2 (Genencor International, Palo Alto, Calif.) as an XbaI
fragment, and subcloned into pKP22 at the XbaI site, to generate pKP23. E. coli
strain RJF10 (see EXAMPLE 5), which is glpK-, was transformed with pKP23
and transformants with the phenotype kanR camS were isolated, indicating double
crossover integration, which was confirmed by Southern blot analysis. Glycerol
dehydrogenase gel activity assays (as described in EXAMPLE 6) demonstrated
that active glycerol dehydrogenase was not present in these transformants. The
kan marker was removed from the chromosome using the Cre-producing plasmid
pJW168, as described in EXAMPLE 5, to produce strain KLP23. Several isolates
with the phenotype kanS demonstrated no glycerol dehydrogenase activity, and
Southern blot analysis confirmed loss of the kan marker.

SEQ ID NO:44:

CACGCATGCAGTTCAACCTGTTGATAGTAC

SEQ ID NO:45:

GCGTCTAGATCCTTTTAAATTAAAAATG
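
The statement that primers SEQ ID NO:44 and SEQ ID NO:45 incorporate terminal SphI and XbaI sites can be checked directly against the two sequences listed above. The following is a minimal sketch; the recognition sequences GCATGC (SphI) and TCTAGA (XbaI) are standard values supplied here as assumptions, and the function name is hypothetical.

```python
# Sketch: locate SphI and XbaI recognition sequences in the two primers.
# Recognition sequences are standard values assumed for this check.
SITES = {"SphI": "GCATGC", "XbaI": "TCTAGA"}

PRIMERS = {
    "SEQ ID NO:44": "CACGCATGCAGTTCAACCTGTTGATAGTAC",
    "SEQ ID NO:45": "GCGTCTAGATCCTTTTAAATTAAAAATG",
}


def find_sites(seq: str) -> dict:
    """Map each enzyme whose site occurs in seq to the 0-based start index."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for primer_name, primer_seq in PRIMERS.items():
    print(primer_name, find_sites(primer_seq))
```

Each primer carries exactly one of the two sites near its 5' end, preceded by three extra bases.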
EXAMPLE 8

CONSUMPTION OF GLYCEROL PRODUCED FROM D-GLUCOSE BY
RECOMBINANT E. COLI CONTAINING BOTH GPP2 AND DAR1 WITH
AND WITHOUT GLYCEROL KINASE (GLPK) ACTIVITY
EXAMPLE 8 illustrates the consumption of glycerol by the recombinant
E. coli FM5/pAH48 and RJF10/pAH48. The strains FM5/pAH48 and
RJF10/pAH48 were constructed as described above in the GENERAL
METHODS.
Pre-Culture 

FM5/pAH48 and RJF10/pAH48 were pre-cultured for seeding a
fermenter in the same medium used for fermentation, or in LB supplemented
with 1% glucose. Either carbenicillin or ampicillin was used (100 mg/L) for
plasmid maintenance. The medium for fermentation is as described in
EXAMPLE 4.

Cultures were started from frozen stocks (15% glycerol as
cryoprotectant) in 600 mL medium in a 2-L Erlenmeyer flask, grown at 30 °C
in a shaker at 250 rpm for approximately 12 h, and used to seed the fermenter.

Fermentation growth 

A 15-L stirred tank fermenter with 5-7 L initial volume was prepared as
described in EXAMPLE 4. Either carbenicillin or ampicillin was used
(100 mg/L) for plasmid maintenance.

Environmental Conditions to Evaluate Glycerol Kinase (GlpK) Activity

The temperature was controlled at 30 °C and the air flow rate controlled
at 6 standard liters per minute. Back pressure was controlled at 0.5 bar.
Dissolved oxygen tension was controlled at 10% by stirring. Aqueous ammonia
was used to control pH at 6.7. The glucose feed (60% glucose) rate was
controlled to maintain excess glucose until glycerol had accumulated to at least
25 g/L. Glucose was then depleted, resulting in the net metabolism of glycerol.
Table 2 shows the resulting conversion of glycerol.
Table 2

Conversion of glycerol by FM5/pAH48 (wt) and RJF10/pAH48 (glpK)

                                      rate of glycerol
                                      consumption
Strain         number of examples     g/OD/hr

FM5/pAH48      2                      0.095 ± 0.015
RJF10/pAH48    3                      0.021 ± 0.011

As is seen by the data in Table 2, the rate of glycerol consumption 
decreases about 4-5 fold where endogenous glycerol kinase activity is 
eliminated. 
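
The "about 4-5 fold" figure follows from the ratio of the two mean rates in Table 2. A minimal sketch of the arithmetic is given below; the function name is an illustrative assumption, and the calculation uses only the mean values without propagating the reported ± spreads.

```python
def fold_reduction(control_rate: float, mutant_rate: float) -> float:
    """Ratio of the control rate to the mutant rate."""
    return control_rate / mutant_rate


# Mean glycerol consumption rates from Table 2, in g/OD/hr
wt_rate = 0.095      # FM5/pAH48 (wt)
glpk_rate = 0.021    # RJF10/pAH48 (glpK)

fold = fold_reduction(wt_rate, glpk_rate)
print(f"fold reduction: {fold:.1f}")
```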

Environmental Conditions to Evaluate Glycerol Dehydrogenase (GldA) Activity
The temperature was controlled at 30 °C and the air flow rate controlled
at 6 standard liters per minute. Back pressure was controlled at 0.5 bar.
Dissolved oxygen tension was controlled at 10% by stirring. Aqueous ammonia
was used to control pH at 6.7. In the first fermentation, glucose was kept in
excess for the duration of the fermentation. The second fermentation was
operated with no residual glucose after the first 25 hours. Samples over time
from the two fermentations were taken for evaluation of GlpK and GldA
activities. Table 3 summarizes RJF10/pAH48 fermentations that show the
effects of GldA on selectivity for glycerol.

Table 3

GldA and GlpK activities from two RJF10/pAH48 fermentations



Time Overall selectivity 

Fermentation (hrs) GldA GlpK (g/g) 

1 25 42% 

46 - 49% 

61 + - 54% 



2 25 + 41% 

46 ++ - 14% 

61 ++ - 12% 

As is seen by the data in Table 3, the presence of glycerol dehydrogenase
(GldA) activity is linked to the conversion of glycerol under glucose-limited
conditions; thus, it is anticipated that eliminating glycerol dehydrogenase
activity will reduce glycerol conversion.
EXAMPLE 9

PRODUCTION OF GLYCEROL FROM D-GLUCOSE USING
RECOMBINANT E. BLATTAE CONTAINING
BOTH GPP2 AND DAR1

Example 9 illustrates the production of glycerol from D-glucose from
recombinant E. blattae containing both GPP2 and DAR1 genes.

E. blattae, obtained from the ATCC and having ATCC accession number
33429, was grown at 30 °C until the culture reached an OD of about 0.6 AU at
600 nm. The culture was then transformed with pAH48, a plasmid comprising
GPP2 and DAR1 genes (described in WO 98/21341), using electroporation
techniques. The transformants were confirmed by DNA RFLP pattern and
antibiotic resistance (200 µg/mL carbenicillin).

The transformed E. blattae was grown aerobically at 35 °C in shake-flask
cultures. The cultures were grown in a defined medium plus 2% glucose
with antibiotic selection and were started by inoculation from an overnight
culture grown in LB plus 1% glucose with antibiotic selection. The defined
medium contained per liter: 27.2 g KH2PO4, 2 g citric acid, 2 g MgSO4·7H2O,
1.2 mL 98% H2SO4, 0.3 g ferric ammonium citrate, 0.2 g CaCl2·2H2O, 10 g
yeast extract (Difco), 5 mL Modified Balch's Trace-Element Solution (the
composition of which can be found in Methods for General and Molecular
Bacteriology (P. Gerhardt et al., eds, p. 158, American Society for
Microbiology, Washington, DC (1994))). The defined medium was
filter-sterilized and adjusted to a final pH 6.8 with NH4OH. The shake-flasks were
incubated at 35 °C overnight with vigorous shaking. The supernatant was then
subjected to HPLC analysis for the presence of glycerol. After the overnight
incubation, the E. blattae containing pAH48 produced 7.63 g/L of glycerol.
The control, which was wild-type E. blattae (ATCC 33429) grown under the
same conditions, produced = 0.2 g/L of glycerol. 

EXAMPLE 10

PRODUCTION OF GLYCEROL FROM D-GLUCOSE
USING RECOMBINANT E. COLI DEFICIENT IN GLDA AND GLPK
AND CONTAINING BOTH GPP2 AND DAR1 INTEGRATED
INTO THE CHROMOSOME
This Example illustrates the production of glycerol from D-glucose from
recombinant E. coli with gldA and glpK gene knockouts and containing both
GPP2 and DAR1 encoding genes integrated into the host cell chromosome.

E. coli strain KLP23, prepared as described in Example 7, is deficient in 
both glycerol kinase (product of glpK) and glycerol dehydrogenase (product of 



WO 99/28480 PCT/US98/25551

gldA) activities. KLP23 containing DAR1, GPP2 and a loxP-flanked
chloramphenicol resistance gene integrated into the chromosome at the ampC
location was prepared and is referred to as AH76RIcm.

Integration plasmids were designed and constructed based on a cre-lox
integration system (Hoess, supra). In order to create the integration plasmids, a
HindIII-SmaI fragment of pLoxCat1 was inserted into HindIII and SmaI
linearized pAH48 to create pAH48cm2. The pAH48 plasmid contains DAR1
and GPP2 genes expressed under the control of the trc promoter. The 3.5 kb
ApaLI fragment of pAH48cm2 was blunt ended with T4 DNA polymerase
(Boehringer Mannheim Biochemical) and dNTPs and inserted into NruI
linearized pInt-ampC (Genencor International, CA), using E. coli SY327 (Miller
et al., J. Bacteriol. 170:2575-2583, 1988) as a host to create pAH76 and
pAH76R. The "R" means reverse orientation of the integration cassette. Both
plasmids, pAH76 and pAH76R, contain an R6K origin of replication and are not
able to replicate in KLP23. The plasmids pAH76 and pAH76R were used to
transform KLP23 for integration at the ampC location of the E. coli
chromosome. The transformants were selected on 10 µg/mL of chloramphenicol
and were kanamycin sensitive, indicating double crossover integration. These
E. coli transformants are named AH76Icm and AH76RIcm.

AH76RIcm cultures were grown in shake-flasks in defined medium
(described in Example 9) plus 2.5% glucose started by inoculation from an
overnight LB culture having 1% glucose and antibiotic selection. The
shake-flasks (Erlenmeyer flasks, liquid volume 1/5th of total volume) were incubated at
37 °C with vigorous shaking overnight, after which the supernatant was sampled
for glycerol using a colorimetric enzyme assay (Sigma, Procedure No. 337) on a
Monarch 2000 instrument (Instrumentation Laboratory Co., Lexington, MA).
AH76RIcm showed glycerol production of 6.7 g/L after 25 hr.

E. coli AH76RI has the chloramphenicol gene deleted from AH76RIcm.
The chloramphenicol gene was deleted from the chromosome using the
Cre-producing plasmid, pJW168, as described in Example 5. The transformants
were selected for carbenicillin resistance and chloramphenicol sensitivity under
1 mM IPTG induction at 30 °C. After removal of the chloramphenicol gene,
AH76RI was grown on LB medium without any antibiotics to cure pJW168.
The final version of AH76RI is not able to grow on chloramphenicol or
carbenicillin selection.

AH76RI cultures were grown in shake-flasks in a defined medium plus 2%
glucose, started by inoculation from an overnight LB/1% glucose culture. The
shake-flasks were incubated at 35 °C with vigorous shaking overnight, after
which the supernatant was sampled for glycerol using a colorimetric assay
(Sigma, Procedure No. 337) on a Monarch 2000 instrument (Instrumentation
Laboratory Co., Lexington, MA). AH77RI showed glycerol production of
4.6 g/L after 24 hr.
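
The two shake-flask results above can be put on a common time basis by
dividing the final glycerol titer by the elapsed time. The following is a
minimal sketch only; the function name and the run labels are hypothetical,
and the calculation assumes a simple average over the whole run. Because the
two cultures were grown at different glucose concentrations and temperatures,
the figures are not a controlled comparison.

```python
def volumetric_productivity(titer_g_per_l: float, hours: float) -> float:
    """Average volumetric productivity (g/L/h) over the whole run."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return titer_g_per_l / hours


# Titers and run times as reported in the text above; labels are illustrative.
runs = {
    "chloramphenicol-marked integrant": (6.7, 25.0),
    "marker-free integrant": (4.6, 24.0),
}

for label, (titer, hours) in runs.items():
    rate = volumetric_productivity(titer, hours)
    print(f"{label}: {rate:.3f} g/L/h")
```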

All the plasmids described in this example were transformed into
E. coli KLP23 using standard molecular biology techniques. The
transformants were verified by DNA RFLP pattern, antibiotic resistance,
PCR amplification, or G3P phosphatase assay.
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM
(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the
description on page 5, line 20

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [ ]

Name of depositary institution:
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country):
10801 University Blvd.
Manassas, Virginia 20110-2209
USA

Date of deposit: 26 September 1996
Accession Number: ATCC98187

C. ADDITIONAL INDICATIONS (leave blank if not applicable)
This information is continued on an additional sheet [ ]

In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until
the publication of the mention of the grant of the European patent or
until the date on which the application has been refused or withdrawn
or is deemed to be withdrawn, only by the issue of such a sample to an
expert nominated by the person requesting the sample. (Rule 28(4) EPC)

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are
not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)
The indications listed below will be submitted to the International Bureau
later (specify the general nature of the indications e.g., "Accession
Number of Deposit")

For receiving Office use only
This sheet was received with the international application
Authorized officer

For International Bureau use only
This sheet was received by the International Bureau on:
Authorized officer

Form PCT/RO/134 (July 1992)
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM
(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the
description on page 5, line 21

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [ ]

Name of depositary institution:
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country):
10801 University Blvd.
Manassas, Virginia 20110-2209
USA

Date of deposit: 6 November 1996
Accession Number: ATCC98248

C. ADDITIONAL INDICATIONS (leave blank if not applicable)
This information is continued on an additional sheet [ ]

In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until
the publication of the mention of the grant of the European patent or
until the date on which the application has been refused or withdrawn
or is deemed to be withdrawn, only by the issue of such a sample to an
expert nominated by the person requesting the sample. (Rule 28(4) EPC)

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are
not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)
The indications listed below will be submitted to the International Bureau
later (specify the general nature of the indications e.g., "Accession
Number of Deposit")

For receiving Office use only
This sheet was received with the international application
Authorized officer

For International Bureau use only
This sheet was received by the International Bureau on:
Authorized officer
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM
(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the
description on page 5, line 22

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [ ]

Name of depositary institution:
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country):
10801 University Blvd.
Manassas, Virginia 20110-2209
USA

Date of deposit: 25 November 1997
Accession Number: ATCC98597

C. ADDITIONAL INDICATIONS (leave blank if not applicable)
This information is continued on an additional sheet [ ]

In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until
the publication of the mention of the grant of the European patent or
until the date on which the application has been refused or withdrawn
or is deemed to be withdrawn, only by the issue of such a sample to an
expert nominated by the person requesting the sample. (Rule 28(4) EPC)

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are
not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

For receiving Office use only
This sheet was received with the international application
Authorized officer

For International Bureau use only
This sheet was received by the International Bureau on:
Authorized officer
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM
(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the
description on page 5, line 23

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [ ]

Name of depositary institution:
AMERICAN TYPE CULTURE COLLECTION

Address of depositary institution (including postal code and country):
10801 University Blvd.
Manassas, Virginia 20110-2209
USA

Date of deposit: 25 November 1997
Accession Number: ATCC98598

C. ADDITIONAL INDICATIONS (leave blank if not applicable)
This information is continued on an additional sheet [ ]

In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until
the publication of the mention of the grant of the European patent or
until the date on which the application has been refused or withdrawn
or is deemed to be withdrawn, only by the issue of such a sample to an
expert nominated by the person requesting the sample. (Rule 28(4) EPC)

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are
not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)
The indications listed below will be submitted to the International Bureau
later (specify the general nature of the indications e.g., "Accession
Number of Deposit")

For receiving Office use only
This sheet was received with the international application
Authorized officer

For International Bureau use only
This sheet was received by the International Bureau on:
Authorized officer
WHAT IS CLAIMED IS:

1. A method for the production of glycerol from a recombinant
organism comprising:

(i) transforming a suitable host cell with an expression cassette
comprising either one or both of

(a) a gene encoding a protein having glycerol-3-phosphate
dehydrogenase activity, and

(b) a gene encoding a protein having glycerol-3-phosphate
phosphatase activity,

the suitable host cell having a disruption in either one or both of

(a) an endogenous gene encoding a polypeptide having
glycerol kinase activity, and

(b) an endogenous gene encoding a polypeptide having
glycerol dehydrogenase activity,

wherein the disruption prevents the expression of active gene product;

(ii) culturing the transformed host cell of (i) in the presence of at
least one carbon source selected from the group consisting of monosaccharides,
oligosaccharides, polysaccharides, and single-carbon substrates, whereby
glycerol is produced; and

(iii) optionally recovering the glycerol produced in (ii).

2. The method of Claim 1 wherein the expression cassette comprises a
gene encoding a glycerol-3-phosphate dehydrogenase enzyme.

3. The method of Claim 1 wherein the expression cassette comprises a
gene encoding a glycerol-3-phosphate phosphatase enzyme.

4. The method of Claim 1 wherein the expression cassette comprises
genes encoding a glycerol-3-phosphate phosphatase enzyme and a
glycerol-3-phosphate dehydrogenase enzyme.

5. The method of Claim 1 wherein the host cell contains a disruption in
a gene encoding an endogenous glycerol kinase enzyme wherein the disruption
prevents the expression of active gene product.

6. The method of Claim 1 wherein the host cell contains a disruption in
a gene encoding an endogenous glycerol dehydrogenase enzyme wherein the
disruption prevents the expression of active gene product.

7. The method of Claim 1 wherein the host cell contains a) a disruption
in a gene encoding an endogenous glycerol kinase enzyme and b) a disruption in
a gene encoding an endogenous glycerol dehydrogenase enzyme, wherein the
disruptions in the respective genes prevent the expression of active gene
product from either gene.

8. The method of Claim 1 wherein the suitable host cell is selected
from the group consisting of bacteria, yeast, and filamentous fungi.

9. The method of Claim 8 wherein the suitable host cell is selected
from the group consisting of Citrobacter, Enterobacter, Clostridium, Klebsiella,
Aerobacter, Lactobacillus, Aspergillus, Saccharomyces, Schizosaccharomyces,
Zygosaccharomyces, Pichia, Kluyveromyces, Candida, Hansenula,
Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia, Salmonella,
Bacillus, Streptomyces, and Pseudomonas.

10. The method of Claim 9 wherein the suitable host cell is E. coli or
Saccharomyces sp.

11. The method of Claim 1 wherein the carbon source is glucose.

12. The method of Claim 1 wherein the protein having
glycerol-3-phosphate dehydrogenase activity corresponds to amino acid
sequences selected from the group consisting of SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, and SEQ ID NO:12 and wherein the
amino acid sequences encompass amino acid substitutions, deletions or
insertions that do not alter the functional properties of the enzyme.

13. The method of Claim 1 wherein the protein having
glycerol-3-phosphatase activity corresponds to the amino acid sequences
selected from the group consisting of SEQ ID NO:13 and SEQ ID NO:14, and
wherein the amino acid sequences may encompass amino acid substitutions,
deletions or additions that do not alter the function of the enzyme.

14. A transformed host cell comprising:

(a) a gene encoding a protein having a glycerol-3-phosphate
dehydrogenase activity;

(b) a gene encoding a protein having glycerol-3-phosphate
phosphatase activity;

(c) a disruption in a gene encoding an endogenous glycerol
kinase; and

(d) a disruption in a gene encoding an endogenous glycerol
dehydrogenase;

wherein the disruptions in the genes of (c) and (d) prevent the expression of
active gene product, and wherein the host cell converts at least one carbon
source selected from the group consisting of monosaccharides, oligosaccharides,
polysaccharides, and single-carbon substrates to glycerol.

15. A transformed host cell comprising:

(a) a gene encoding a protein having a glycerol-3-phosphate
dehydrogenase activity;

(b) a gene encoding a protein having glycerol-3-phosphate
phosphatase activity; and

(c) a disruption in a gene encoding an endogenous glycerol
dehydrogenase;

wherein the disruption in the gene of (c) prevents the expression of active gene
product, and wherein the host cell converts at least one carbon source selected
from the group consisting of monosaccharides, oligosaccharides,
polysaccharides, and single-carbon substrates to glycerol.

16. A transformed host cell comprising:

(a) a gene encoding a protein having a glycerol-3-phosphate
dehydrogenase activity;

(b) a gene encoding a protein having glycerol-3-phosphate
phosphatase activity; and

(c) a disruption in a gene encoding an endogenous glycerol
kinase,

wherein the disruption in the gene of (c) prevents the expression of active gene
product, and wherein the host cell converts at least one carbon source selected
from the group consisting of monosaccharides, oligosaccharides,
polysaccharides, and single-carbon substrates to glycerol.

17. A method for the production of 1,3-propanediol from a recombinant
organism comprising:

(i) transforming a suitable host cell with an expression cassette
comprising either one or both of

(a) a gene encoding a protein having glycerol-3-phosphate
dehydrogenase activity, and

(b) a gene encoding a protein having glycerol-3-phosphate
phosphatase activity,

the suitable host cell having at least one gene encoding a protein having a
dehydratase activity and having a disruption in either one or both of:

(a) an endogenous gene encoding a polypeptide having
glycerol kinase activity, and

(b) an endogenous gene encoding a polypeptide having
glycerol dehydrogenase activity,

wherein the disruption in the genes of (a) or (b) prevents the expression of active
gene product;

(ii) culturing the transformed host cell of (i) in the presence of at
least one carbon source selected from the group consisting of monosaccharides,
oligosaccharides, polysaccharides, and single-carbon substrates whereby
1,3-propanediol is produced; and

(iii) recovering the 1,3-propanediol produced in (ii).

18. The method of Claim 17 wherein the protein having a dehydratase
activity is selected from the group consisting of a glycerol dehydratase enzyme
and a diol dehydratase enzyme.

19. The method of Claim 18 wherein the glycerol dehydratase enzyme is
encoded by a gene, the gene isolated from a microorganism, the microorganism
selected from the group consisting of Klebsiella, Lactobacillus, Enterobacter,
Citrobacter, Pelobacter, Ilyobacter, and Clostridium.

20. The method of Claim 18 wherein the diol dehydratase enzyme is
encoded by a gene, the gene isolated from a microorganism, the microorganism
selected from the group consisting of Klebsiella and Salmonella.
(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1380 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CTTTAATTTT CTTTTATCTT ACTCTCCTAC ATAAGACATC AAGAAACAAT TGTATATTGT 60
ACACCCCCCC CCTCCACAAA CACAAATATT GATAATATAA AGATGTCTGC TGCTGCTGAT 120
AGATTAAACT TAACTTCCGG CCACTTGAAT GCTGGTAGAA AGAGAAGTTC CTCTTCTGTT 180
TCTTTGAAGG CTGCCGAAAA GCCTTTCAAG GTTACTGTGA TTGGATCTGG TAACTGGGGT 240
ACTACTATTG CCAAGGTGGT TGCCGAAAAT TGTAAGGGAT ACCCAGAAGT TTTCGCTCCA 300
ATAGTACAAA TGTGGGTGTT CGAAGAAGAG ATCAATGGTG AAAAATTGAC TGAAATCATA 360
AATACTAGAC ATCAAAACGT GAAATACTTG CCTGGCATCA CTCTACCCGA CAATTTGGTT 420
GCTAATCCAG ACTTGATTGA TTCAGTCAAG GATGTCGACA TCATCGTTTT CAACATTCCA 480
CATCAATTTT TGCCCCGTAT CTGTAGCCAA TTGAAAGGTC ATGTTGATTC ACACGTCAGA 540
GCTATCTCCT GTCTAAAGGG TTTTGAAGTT GGTGCTAAAG GTGTCCAATT GCTATCCTCT 600
TACATCACTG AGGAACTAGG TATTCAATGT GGTGCTCTAT CTGGTGCTAA CATTGCCACC 660
GAAGTCGCTC AAGAACACTG GTCTGAAACA ACAGTTGCTT ACCACATTCC AAAGGATTTC 720
AGAGGCGAGG GCAAGGACGT CGACCATAAG GTTCTAAAGG CCTTGTTCCA CAGACCTTAC 780
TTCCACGTTA GTGTCATCGA AGATGTTGCT GGTATCTCCA TCTGTGGTGC TTTGAAGAAC 840
GTTGTTGCCT TAGGTTGTGG TTTCGTCGAA GGTCTAGGCT GGGGTAACAA CGCTTCTGCT 900
GCCATCCAAA GAGTCGGTTT GGGTGAGATC ATCAGATTCG GTCAAATGTT TTTCCCAGAA 960
TCTAGAGAAG AAACATACTA CCAAGAGTCT GCTGGTGTTG CTGATTTGAT CACCACCTGC 1020
GCTGGTGGTA GAAACGTCAA GGTTGCTAGG CTAATGGCTA CTTCTGGTAA GGACGCCTGG 1080
GAATGTGAAA AGGAGTTGTT GAATGGCCAA TCCGCTCAAG GTTTAATTAC CTGCAAAGAA 1140
GTTCACGAAT GGTTGGAAAC ATGTGGCTCT GTCGAAGACT TCCCATTATT TGAAGCCGTA 1200
TACCAAATCG TTTACAACAA CTACCCAATG AAGAACCTGC CGGACATGAT TGAAGAATTA 1260
GATCTACATG AAGATTAGAT TTATTGGAGA AAGATAACAT ATCATACTTC CCCCACTTTT 1320
TTCGAGGCTC TTCTATATCA TATTCATAAA TTAGCATTAT GTCATTTCTC ATAACTACTT 1380
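
Each row of a sequence description such as the one above carries blocks of
bases followed by the cumulative position of the last base in the row, so the
listing can be checked mechanically. The following is a minimal sketch of such
a check; the function name is hypothetical, and it assumes that the only
damage to a row is whitespace inserted inside the position number (for
example "24 0" for 240).

```python
import re


def parse_listing_rows(rows):
    """Join sequence-listing rows and verify each row's running position.

    Tokens made only of A/C/G/T are treated as bases. Tokens made only of
    digits are concatenated and read as the cumulative position, which
    tolerates a number that was split by stray whitespace.
    """
    sequence = []
    total = 0
    for row in rows:
        tokens = row.split()
        bases = "".join(t for t in tokens if re.fullmatch(r"[ACGTacgt]+", t))
        digits = "".join(t for t in tokens if t.isdigit())
        total += len(bases)
        if digits and int(digits) != total:
            raise ValueError(f"row ends at base {total}, listing says {digits}")
        sequence.append(bases.upper())
    return "".join(sequence)


# Illustrative rows only (not taken from the listing).
example = parse_listing_rows([
    "ACGTACGTAC GTACGTACGT 20",
    "AAAAACCCCC 3 0",
])
print(len(example))
```

A mismatch between the running count and the printed position raises
ValueError, which points to the row where bases were lost or duplicated.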
(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2946 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GAATTCGAGC CTGAAGTGCT GATTACCTTC AGGTAGACTT CATCTTGACC CATCAACCCC 60
AGCGTCAATC CTGCAAATAC ACCACCCAGC AGCACTAGGA TGATAGAGAT AATATAGTAC 120
GTGGTAACGC TTGCCTCATC ACCTACGCTA TGGCCGGAAT CGGCAACATC CCTAGAATTG 180
AGTACGTGTG ATCCGGATAA CAACGGCAGT GAATATATCT TCGGTATCGT AAAGATGTGA 240
TATAAGATGA TGTATACCCA ATGAGGAGCG CCTGATCGTG ACCTAGACCT TAGTGGCAAA 300
AACGACATAT CTATTATAGT GGGGAGAGTT TCGTGCAAAT AACAGACGCA GCAGCAAGTA 360
ACTGTGACGA TATCAACTCT TTTTTTATTA TGTAATAAGC AAACAAGCAC GAATGGGGAA 420
AGCCTATGTG CAATCACCAA GGTCGTCCCT TTTTTCCCAT TTGCTAATTT AGAATTTAAA 480
GAAACCAAAA GAATGAAGAA AGAAAACAAA TACTAGCCCT AACCCTGACT TCGTTTCTAT 540
GATAATACCC TGCTTTAATG AACGGTATGC CCTAGGGTAT ATCTCACTCT GTACGTTACA 600
AACTCCGGTT ATTTTATCGG AACATCCGAG CACCCGCGCC TTCCTCAACC CAGGCACCGC 660
CCCAGGTAAC CGTGCGCGAT GAGCTAATCC TGAGCCATCA CCCACCCCAC CCGTTGATGA 720
CAGCAATTCG GGAGGGCGAA AATAAAACTG GAGCAAGGAA TTACCATCAC CGTCACCATC 780
ACCATCATAT CGCCTTAGCC TCTAGCCATA GCCATCATGC AAGCGTGTAT CTTCTAAGAT 840
TCAGTCATCA TCATTACCGA GTTTGTTTTC CTTCACATGA TGAAGAAGGT TTGAGTATGC 900
TCGAAACAAT AAGACGACGA TGGCTCTGCC ATTGGTTATA TTACGCTTTT GCGGCGAGGT 960
GCCGATGGGT TGCTGAGGGG AAGAGTGTTT AGCTTACGGA CCTATTGCCA TTGTTATTCC 1020
GATTAATCTA TTGTTCAGCA GCTCTTCTCT ACCCTGTCAT TCTAGTATTT TTTTTTTTTT 1080
TTTTTGGTTT TACTTTTTTT TCTTCTTGCC TTTTTTTCTT GTTACTTTTT TTCTAGTTTT 1140
TTTTCCTTCC ACTAAGCTTT TTCCTTGATT TATCCTTGGG TTCTTCTTTC TACTCCTTTA 1200
GATTTTTTTT TTATATATTA ATTTTTAAGT TTATGTATTT TGGTAGATTC AATTCTCTTT 1260
CCCTTTCCTT TTCCTTCGCT CCCCTTCCTT ATCAATGCTT GCTGTCAGAA GATTAACAAG 1320
ATACACATTC CTTAAGCGAA CGCATCCGGT GTTATATACT CGTCGTGCAT ATAAAATTTT 1380
GCCTTCAAGA TCTACTTTCC TAAGAAGATC ATTATTACAA ACACAACTGC ACTCAAAGAT 1440
GACTGCTCAT ACTAATATCA AACAGCACAA ACACTGTCAT GAGGACCATC CTATCAGAAG 1500
ATCGGACTCT GCCGTGTCAA TTGTACATTT GAAACGTGCG CCCTTCAAGG TTACAGTGAT 1560
TGGTTCTGGT AACTGGGGGA CCACCATCGC CAAAGTCATT GCGGAAAACA CAGAATTGCA 1620
TTCCCATATC TTCGAGCCAG AGGTGAGAAT GTGGGTTTTT GATGAAAAGA TCGGCGACGA 1680
AAATCTGACG GATATCATAA ATACAAGACA CCAGAACGTT AAATATCTAC CCAATATTGA 1740
CCTGCCCCAT AATCTAGTGG CCGATCCTGA TCTTTTACAC TCCATCAAGG GTGCTGACAT 1800
CCTTGTTTTC AACATCCCTC ATCAATTTTT ACCAAACATA GTCAAACAAT TGCAAGGCCA 1860
CGTGGCCCCT CATGTAAGGG CCATCTCGTG TCTAAAAGGG TTCGAGTTGG GCTCCAAGGG 1920
TGTGCAATTG CTATCCTCCT ATGTTACTGA TGAGTTAGGA ATCCAATGTG GCGCACTATC 1980
TGGTGCAAAC TTGGCACCGG AAGTGGCCAA GGAGCATTGG TCCGAAACCA CCGTGGCTTA 2040
CCAACTACCA AAGGATTATC AAGGTGATGG CAAGGATGTA GATCATAAGA TTTTGAAATT 2100
GCTGTTCCAC AGACCTTACT TCCACGTCAA TGTCATCGAT GATGTTGCTG GTATATCCAT 2160
TGCCGGTGCC TTGAAGAACG TCGTGGCACT TGCATGTGGT TTCGTAGAAG GTATGGGATG 2220
GGGTAACAAT GCCTCCGCAG CCATTCAAAG GCTGGGTTTA GGTGAAATTA TCAAGTTCGG 2280
TAGAATGTTT TTCCCAGAAT CCAAAGTCGA GACCTACTAT CAAGAATCCG CTGGTGTTGC 2340
AGATCTGATC ACCACCTGCT CAGGCGGTAG AAACGTCAAG GTTGCCACAT ACATGGCCAA 2400
GACCGGTAAG TCAGCCTTGG AAGCAGAAAA GGAATTGCTT AACGGTCAAT CCGCCCAAGG 2460
GATAATCACA TGCAGAGAAG TTCACGAGTG GCTACAAACA TGTGAGTTGA CCCAAGAATT 2520
CCCAATTATT CGAGGCAGTC TACCAGATAG TCTACAACAA CGTCCGCATG GAAGACCTAC 2580
CGGAGATGAT TGAAGAGCTA GACATCGATG ACGAATAGAC ACTCTCCCCC CCCCTCCCCC 2640
TCTGATCTTT CCTGTTGCCT CTTTTTCCCC CAACCAATTT ATCATTATAC ACAAGTTCTA 2700
CAACTACTAC TAGTAACATT ACTACAGTTA TTATAATTTT CTATTCTCTT TTTCTTTAAG 2760
AATCTATCAT TAACGTTAAT TTCTATATAT ACATAACTAC CATTATACAC GCTATTATCG 2820
TTTACATATC ACATCACCGT TAATGAAAGA TACGACACCC TGTACACTAA CACAATTAAA 2880
TAATCGCCAT AACCTTTTCT GTTATCTATA GCCCTTAAAG CTGTTTCTTC GAGCTTTTCA 2940
CTGCAG 2946
(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3178 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CTGCAGAACT TCGTCTGCTC TGTGCCCATC CTCGCGGTTA GAAAGAAGCT GAATTGTTTC 60
ATGCGCAAGG GCATCAGCGA GTGACCAATA ATCACTGCAC TAATTCCTTT TTAGCAACAC 120
ATACTTATAT ACAGCACCAG ACCTTATGTC TTTTCTCTGC TCCGATACGT TATCCCACCC 180
AACTTTTATT TCAGTTTTGG CAGGGGAAAT TTCACAACCC CGCACGCTAA AAATCGTATT 240
TAAACTTAAA AGAGAACAGC CACAAATAGG GAACTTTGGT CTAAACGAAG GACTCTCCCT 300
CCCTTATCTT GACCGTGCTA TTGCCATCAC TGCTACAAGA CTAAATACGT ACTAATATAT 360
GTTTTCGGTA ACGAGAAGAA GAGCTGCCGG TGCAGCTGCT GCCATGGCCA CAGCCACGGG 420
GACGCTGTAC TGGATGACTA GCCAAGGTGA TAGGCCGTTA GTGCACAATG ACCCGAGCTA 480
CATGGTGCAA TTCCCCACCG CCGCTCCACC GGCAGGTCTC TAGACGAGAC CTGCTGGACC 540
GTCTGGACAA GACGCATCAA TTCGACGTGT TGATCATCGG TGGCGGGGCC ACGGGGACAG 600
GATGTGCCCT AGATGCTGCG ACCAGGGGAC TCAATGTGGC CCTTGTTGAA AAGGGGGATT 660
TTGCCTCGGG AACGTCGTCC AAATCTACCA AGATGATTCA CGGTGGGGTG CGGTACTTAG 720
AGAAGGCCTT CTGGGAGTTC TCCAAGGCAC AACTGGATCT GGTCATCGAG GCACTCAACG 780
AGCGTAAACA TCTTATCAAC ACTGCCCCTC ACCTGTGCAC GGTGCTACCA ATTCTGATCC 840
CCATCTACAG CACCTGGCAG GTCCCGTACA TCTATATGGG CTGTAAATTC TACGATTTCT 900
TTGGCGGTTC CCAAAACTTG AAAAAATCAT ACCTACTGTC CAAATCCGCC ACCGTGGAGA 960
AGGCTCCCAT GCTTACCACA GACAATTTAA AGGCCTCGCT TGTGTACCAT GATGGGTCCT 1020
TTAACGACTC GCGTTTGAAC GCCACTTTAG CCATCACGGG TGTGGAGAAC GGCGCTACCG 1080
TCTTGATCTA TGTCGAGGTA CAAAAATTGA TCAAAGACCC AACTTCTGGT AAGGTTATCG 1140
GTGCCGAGGC CCGGGACGTT GAGACTAATG AGCTTGTCAG AATCAACGCT AAATGTGTGG 1200
TCAATGCCAC GGGCCCATAC AGTGACGCCA TTTTGCAAAT GGACCGCAAC CCATCCGGTC 1260
TGCCGGACTC CCCGCTAAAC GACAACTCCA AGATCAAGTC GACTTTCAAT CAAATCTCCG 1320
TCATGGACCC GAAAATGGTC ATCCCATCTA TTGGCGTTCA CATCGTATTG CCCTCTTTTT 1380
ACTCCCCGAA GGATATGGGT TTGTTGGACG TCAGAACCTC TGATGGCAGA GTGATGTTCT 1440
TTTTACCTTG GCAGGGCAAA GTCCTTGCCG GCACCACAGA CATCCCACTA AAGCAAGTCC 1500
CAGAAAACCC TATGCCTACA GAGGCTGATA TTCAAGATAT CTTGAAAGAA CTACAGCACT 1560
ATATCGAATT CCCCGTGAAA AGAGAAGACG TGCTAAGTGC ATGGGCTGGT GTCAGACCTT 1620
TGGTCAGAGA TCCACGTACA ATCCCCGCAG ACGGGAAGAA GGGCTCTGCC ACTCAGGGCG 1680
TGGTAAGATC CCACTTCTTG TTCACTTCGG ATAATGGCCT AATTACTATT GCAGGTGGTA 1740
AATGGACTAC TTACAGACAA ATGGCTGAGG AAACAGTCGA CAAAGTTGTC GAAGTTGGCG 1800
GATTCCACAA CCTGAAACCT TGTCACACAA GAGATATTAA GCTTGCTGGT GCAGAAGAAT 1860
GGACGCAAAA CTATGTGGCT TTATTGGCTC AAAACTACCA TTTATCATCA AAAATGTCCA 1920
ACTACTTGGT TCAAAACTAC GGAACCCGTT CCTCTATCAT TTGCGAATTT TTCAAAGAAT 1980
CCATGGAAAA TAAACTGCCT TTGTCCTTAG CCGACAAGGA AAATAACGTA ATCTACTCTA 2040
GCGAGGAGAA CAACTTGGTC AATTTTGATA CTTTCAGATA TCCATTCACA ATCGGTGAGT 2100
TAAAGTATTC CATGCAGTAC GAATATTGTA GAACTCCCTT GGACTTCCTT TTAAGAAGAA 2160
CAAGATTCGC CTTCTTGGAC GCCAAGGAAG CTTTGAATGC CGTGCATGCC ACCGTCAAAG 2220
TTATGGGTGA TGAGTTCAAT TGGTCGGAGA AAAAGAGGCA GTGGGAACTT GAAAAAACTG 2280
TGAACTTCAT CCAAGGACGT TTCGGTGTCT AAATCGATCA TGATAGTTAA GGGTGACAAA 2340
GATAACATTC ACAAGAGTAA TAATAATGGT AATGATGATA ATAATAATAA TGATAGTAAT 2400
AACAATAATA ATAATGGTGG TAATGGCAAT GAAATCGCTA TTATTACCTA TTTTCCTTAA 2460
TGGAAGAGTT AAAGTAAACT AAAAAAACTA CAAAAATATA TGAAGAAAAA AAAAAAAAGA 2520
GGTAATAGAC TCTACTACTA CAATTGATCT TCAAATTATG ACCTTCCTAG TGTTTATATT 2580
CTATTTCCAA TACATAATAT AATCTATATA ATCATTGCTG GTAGACTTCC GTTTTAATAT 2640
CGTTTTAATT ATCCCCTTTA TCTCTAGTCT AGTTTTATCA TAAAATATAG AAACACTAAA 2700
TAATATTCTT CAAACGGTCC TGGTGCATAC GCAATACATA TTTATGGTGC AAAAAAAAAA 2760
ATGGAAAATT TTGCTAGTCA TAAACCCTTT CATAAAACAA TACGTAGACA TCGCTACTTG 2820
AAATTTTCAA GTTTTTATCA GATCCATGTT TCCTATCTGC CTTGACAACC TCATCGTCGA 2880
AATAGTACCA TTTAGAACGC CCAATATTCA CATTGTGTTC AAGGTCTTTA TTCACCAGTG 2940
ACGTGTAATG GCCATGATTA ATGTGCCTGT ATGGTTAACC ACTCCAAATA GCTTATATTT 3000
CATAGTGTCA TTGTTTTTCA ATATAATGTT TAGTATCAAT GGATATGTTA CGACGGTGTT 3060
ATTTTTCTTG GTCAAATCGT AATAAAATCT CGATAAATGG ATGACTAAGA TTTTTGGTAA 3120
AGTTACAAAA TTTATCGTTT TCACTGTTGT CAATTTTTTG TTCTTGTAAT CACTCGAG 3178

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 816 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

ATGAAACGTT TCAATGTTTT AAAATATATC AGAACAACAA AAGCAAATAT ACAAACCATC 60
GCAATGCCTT TGACCACAAA ACCTTTATCT TTGAAAATCA ACGCCGCTCT ATTCGATGTT 120
GACGGTACCA TCATCATCTC TCAACCAGCC ATTGCTGCTT TCTGGAGAGA TTTCGGTAAA 180
GACAAGCCTT ACTTCGATGC CGAACACGTT ATTCACATCT CTCACGGTTG GAGAACTTAC 240
GATGCCATTG CCAAGTTCGC TCCAGACTTT GCTGATGAAG AATACGTTAA CAAGCTAGAA 300
GGTGAAATCC CAGAAAAGTA CGGTGAACAC TCCATCGAAG TTCCAGGTGC TGTCAAGTTG 360
TGTAATGCTT TGAACGCCTT GCCAAAGGAA AAATGGGCTG TCGCCACCTC TGGTACCCGT 420
GACATGGCCA AGAAATGGTT CGACATTTTG AAGATCAAGA GACCAGAATA CTTCATCACC 480
GCCAATGATG TCAAGCAAGG TAAGCCTCAC CCAGAACCAT ACTTAAAGGG TAGAAACGGT 540
TTGGGTTTCC CAATTAATGA ACAAGACCCA TCCAAATCTA AGGTTGTTGT CTTTGAAGAC 600
GCACCAGCTG GTATTGCTGC TGGTAAGGCT GCTGGCTGTA AAATCGTTGG TATTGCTACC 660
ACTTTCGATT TGGACTTCTT GAAGGAAAAG GGTTGTGACA TCATTGTCAA GAACCACGAA 720
TCTATCAGAG TCGGTGAATA CAACGCTGAA ACCGATGAAG TCGAATTGAT CTTTGATGAC 780
TACTTATACG CTAAGGATGA CTTGTTGAAA TGGTAA 816

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 753 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

ATGGGATTGA CTACTAAACC TCTATCTTTG AAAGTTAACG CCGCTTTGTT CGACGTCGAC 60
GGTACCATTA TCATCTCTCA ACCAGCCATT GCTGCATTCT GGAGGGATTT CGGTAAGGAC 120
AAACCTTATT TCGATGCTGA ACACGTTATC CAAGTCTCGC ATGGTTGGAG AACGTTTGAT 180
GCCATTGCTA AGTTCGCTCC AGACTTTGCC AATGAAGAGT ATGTTAACAA ATTAGAAGCT 240
GAAATTCCGG TCAAGTACGG TGAAAAATCC ATTGAAGTCC CAGGTGCAGT TAAGCTGTGC 300
AACGCTTTGA ACGCTCTACC AAAAGAGAAA TGGGCTGTGG CAACTTCCGG TACCCGTGAT 360
ATGGCACAAA AATGGTTCGA GCATCTGGGA ATCAGGAGAC CAAAGTACTT CATTACCGCT 420
AATGATGTCA AACAGGGTAA GCCTCATCCA GAACCATATC TGAAGGGCAG GAATGGCTTA 480
GGATATCCGA TCAATGAGCA AGACCCTTCC AAATCTAAGG TAGTAGTATT TGAAGACGCT 540
CCAGCAGGTA TTGCCGCCGG AAAAGCCGCC GGTTGTAAGA TCATTGGTAT TGCCACTACT 600
TTCGACTTGG ACTTCCTAAA GGAAAAAGGC TGTGACATCA TTGTCAAAAA CCACGAATCC 660
ATCAGAGTTG GCGGCTACAA TGCCGAAACA GACGAAGTTG AATTCATTTT TGACGACTAC 720
TTATATGCTA AGGACGATCT GTTGAAATGG TAA 753
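
A nucleotide listing that begins with ATG and ends with a stop codon, as the
one above does, can be read as an open reading frame and translated for
comparison with a protein listing. The following is a minimal sketch; the
names CODON_TABLE and translate are hypothetical, and the code assumes the
standard genetic code and reading frame 1, neither of which is stated in the
listing itself.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate frame 1 of a DNA string, stopping at the first stop codon.

    Whitespace is ignored so listing rows can be pasted directly.
    Codons containing unexpected characters are rendered as 'X'.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First thirty bases of the listing above.
print(translate("ATGGGATTGA CTACTAAACC TCTATCTTTG"))
```

One-letter codes are used here for brevity; the protein listings in this
document use three-letter codes.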
(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2520 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

TGTATTGGCC ACGATAACCA CCCTTTGTAT ACTGTTTTTG TTTTTCACAT GGTAAATAAC 60
GACTTTTATT AAACAACGTA TGTAAAAACA TAACAAGAAT CTACCCATAC AGGCCATTTC 120
GTAATTCTTC TCTTCTAATT GGAGTAAAAC CATCAATTAA AGGGTGTGGA GTAGCATAGT 180
GAGGGGCTGA CTGCATTGAC AAAAAAATTG AAAAAAAAAA AGGAAAAGGA AAGGAAAAAA 240
AGACAGCCAA GACTTTTAGA ACGGATAAGG TGTAATAAAA TGTGGGGGGA TGCCTGTTCT 300
CGAACCATAT AAAATATACC ATGTGGTTTG AGTTGTGGCC GGAACTATAC AAATAGTTAT 360
ATGTTTCCCT CTCTCTTCCG ACTTGTAGTA TTCTCCAAAC GTTACATATT CCGATCAAGC 420
CAGCGCCTTT ACACTAGTTT AAAACAAGAA CAGAGCCGTA TGTCCAAAAT AATGGAAGAT 480
TTACGAAGTG ACTACGTCCC GCTTATCGCC AGTATTGATG TAGGAACGAC CTCATCCAGA 540
TGCATTCTGT TCAACAGATG GGGCCAGGAC GTTTCAAAAC ACCAAATTGA ATATTCAACT 600
TCAGCATCGA AGGGCAAGAT TGGGGTGTCT GGCCTAAGGA GACCCTCTAC AGCCCCAGCT 660
CGTGAAACAC CAAACGCCGG TGACATCAAA ACCAGCGGAA AGCCCATCTT TTCTGCAGAA 720
GGCTATGCCA TTCAAGAAAC CAAATTCCTA AAAATCGAGG AATTGGACTT GGACTTCCAT 780
AACGAACCCA CGTTGAAGTT CCCCAAACCG GGTTGGGTTG AGTGCCATCC GCAGAAATTA 840
CTGGTGAACG TCGTCCAATG CCTTGCCTCA AGTTTGCTCT CTCTGCAGAC TATCAACAGC 900
GAACGTGTAG CAAACGGTCT CCCACCTTAC AAGGTAATAT GCATGGGTAT AGCAAACATG 960
AGAGAAACCA CAATTCTGTG GTCCCGCCGC ACAGGAAAAC CAATTGTTAA CTACGGTATT 1020
GTTTGGAACG ACACCAGAAC GATCAAAATC GTTAGAGACA AATGGCAAAA CACTAGCGTC 1080
GATAGGCAAC TGCAGCTTAG ACAGAAGACT GGATTGCCAT TGCTCTCCAC GTATTTCTCC 1140
TGTTCCAAGC TGCGCTGGTT CCTCGACAAT GAGCCTCTGT GTACCAAGGC GTATGAGGAG 1200
AACGACCTGA TGTTCGGCAC TGTGGACACA TGGCTGATTT ACCAATTAAC TAAACAAAAG 1260
GCGTTCGTTT CTGACGTAAC CAACGCTTCC AGAACTGGAT TTATGAACCT CTCCACTTTA 1320
AAGTACGACA ACGAGTTGCT GGAATTTTGG GGTATTGACA AGAACCTGAT TCACATGCCC 1380
GAAATTGTGT CCTCATCTCA ATACTACGGT GACTTTGGCA TTCCTGATTG GATAATGGAA 1440
AAGCTACACG ATTCGCCAAA AACAGTACTG CGAGATCTAG TCAAGAGAAA CCTGCCCATA 1500
CAGGGCTGTC TGGGCGACCA AAGCGCATCC ATGGTGGGGC AACTCGCTTA CAAACCCGGT 1560
GCTGCAAAAT GTACTTATGG TACCGGTTGC TTTTTACTGT ACAATACGGG GACCAAAAAA 1620
TTGATCTCCC AACATGGCGC ACTGACGACT CTAGCATTTT GGTTCCCACA TTTGCAAGAG 1680
TACGGTGGCC AAAAACCAGA ATTGAGCAAG CCACATTTTG CATTAGAGGG TTCCGTCGCT 1740
GTGGCTGGTG CTGTGGTCCA ATGGCTACGT GATAATTTAC GATTGATCGA TAAATCAGAG 1800
GATGTCGGAC CGATTGCATC TACGGTTCCT GATTCTGGTG GCGTAGTTTT CGTCCCCGCA 1860
TTTAGTGGCC TATTCGCTCC CTATTGGGAC CCAGATGCCA GAGCCACCAT AATGGGGATG 1920
TCTCAATTCA CTACTGCCTC CCACATCGCC AGAGCTGCCG TGGAAGGTGT TTGCTTTCAA 1980
GCCAGGGCTA TCTTGAAGGC AATGAGTTCT GACGCGTTTG GTGAAGGTTC CAAAGACAGG 2040
GACTTTTTAG AGGAAATTTC CGACGTCACA TATGAAAAGT CGCCCCTGTC GGTTCTGGCA 2100
GTGGATGGCG GGATGTCGAG GTCTAATGAA GTCATGCAAA TTCAAGCCGA TATCCTAGGT 2160
CCCTGTGTCA AAGTCAGAAG GTCTCCGACA GCGGAATGTA CCGCATTGGG GGCAGCCATT 2220
GCAGCCAATA TGGCTTTCAA GGATGTGAAC GAGCGCCCAT TATGGAAGGA CCTACACGAT 2280
GTTAAGAAAT GGGTCTTTTA CAATGGAATG GAGAAAAACG AACAAATATC ACCAGAGGCT 2340
CATCCAAACC TTAAGATATT CAGAAGTGAA TCCGACGATG CTGAAAGGAG AAAGCATTGG 2400
AAGTATTGGG AAGTTGCCGT GGAAAGATCC AAAGGTTGGC TGAAGGACAT AGAAGGTGAA 2460
CACGAACAGG TTCTAGAAAA CTTCCAATAA CAACATAAAT AATTTCTATT AACAATGTAA 2520

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 391 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

Met Ser Ala Ala Ala Asp Arg Leu Asn Leu Thr Ser Gly His Leu Asn
1 5 10 15

Ala Gly Arg Lys Arg Ser Ser Ser Ser Val Ser Leu Lys Ala Ala Glu
20 25 30

Lys Pro Phe Lys Val Thr Val Ile Gly Ser Gly Asn Trp Gly Thr Thr
35 40 45

Ile Ala Lys Val Val Ala Glu Asn Cys Lys Gly Tyr Pro Glu Val Phe
50 55 60

Ala Pro Ile Val Gln Met Trp Val Phe Glu Glu Glu Ile Asn Gly Glu
65 70 75 80

Lys Leu Thr Glu Ile Ile Asn Thr Arg His Gln Asn Val Lys Tyr Leu
85 90 95

Pro Gly Ile Thr Leu Pro Asp Asn Leu Val Ala Asn Pro Asp Leu Ile
100 105 110

Asp Ser Val Lys Asp Val Asp Ile Ile Val Phe Asn Ile Pro His Gln
115 120 125

Phe Leu Pro Arg Ile Cys Ser Gln Leu Lys Gly His Val Asp Ser His
130 135 140

Val Arg Ala Ile Ser Cys Leu Lys Gly Phe Glu Val Gly Ala Lys Gly
145 150 155 160

Val Gln Leu Leu Ser Ser Tyr Ile Thr Glu Glu Leu Gly Ile Gln Cys
165 170 175

Gly Ala Leu Ser Gly Ala Asn Ile Ala Thr Glu Val Ala Gln Glu His
180 185 190

Trp Ser Glu Thr Thr Val Ala Tyr His Ile Pro Lys Asp Phe Arg Gly
195 200 205

Glu Gly Lys Asp Val Asp His Lys Val Leu Lys Ala Leu Phe His Arg
210 215 220

Pro Tyr Phe His Val Ser Val Ile Glu Asp Val Ala Gly Ile Ser Ile
225 230 235 240

Cys Gly Ala Leu Lys Asn Val Val Ala Leu Gly Cys Gly Phe Val Glu
245 250 255

Gly Leu Gly Trp Gly Asn Asn Ala Ser Ala Ala Ile Gln Arg Val Gly
260 265 270

Leu Gly Glu Ile Ile Arg Phe Gly Gln Met Phe Phe Pro Glu Ser Arg
275 280 285

Glu Glu Thr Tyr Tyr Gln Glu Ser Ala Gly Val Ala Asp Leu Ile Thr
290 295 300

Thr Cys Ala Gly Gly Arg Asn Val Lys Val Ala Arg Leu Met Ala Thr
305 310 315 320

Ser Gly Lys Asp Ala Trp Glu Cys Glu Lys Glu Leu Leu Asn Gly Gln
325 330 335

Ser Ala Gln Gly Leu Ile Thr Cys Lys Glu Val His Glu Trp Leu Glu
340 345 350

Thr Cys Gly Ser Val Glu Asp Phe Pro Leu Phe Glu Ala Val Tyr Gln
355 360 365

Ile Val Tyr Asn Asn Tyr Pro Met Lys Asn Leu Pro Asp Met Ile Glu
370 375 380

Glu Leu Asp Leu His Glu Asp
385 390

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 384 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Thr Ala His Thr Asn Ile Lys Gln His Lys His Cys His Glu Asp
1 5 10 15

His Pro Ile Arg Arg Ser Asp Ser Ala Val Ser Ile Val His Leu Lys
20 25 30

Arg Ala Pro Phe Lys Val Thr Val Ile Gly Ser Gly Asn Trp Gly Thr
35 40 45

Thr Ile Ala Lys Val Ile Ala Glu Asn Thr Glu Leu His Ser His Ile
50 55 60

Glu Pro Glu Val Arg Met Trp Val Phe Asp Glu Lys Ile Gly Asp
70 75 80

Glu Asn Leu Thr Asp Ile Ile Asn Thr Arg His Gln Asn Val Lys Tyr
85 90 95

Leu Pro Asn Ile Asp Leu Pro His Asn Leu Val Ala Asp Pro Asp Leu
100 105 110

Leu His Ser Ile Lys Gly Ala Asp Ile Leu Val Phe Asn Ile Pro His
115 120 125

Gln Phe Leu Pro Asn Ile Val Lys Gln Leu Gln Gly His Val Ala Pro
130 135 140

His Val Arg Ala Ile Ser Cys Leu Lys Gly Phe Glu Leu Gly Ser Lys
145 150 155 160

Gly Val Gln Leu Leu Ser Ser Tyr Val Thr Asp Glu Leu Gly Ile Gln
165 170 175

Cys Gly Ala Leu Ser Gly Ala Asn Leu Ala Pro Glu Val Ala Lys Glu
180 185 190

His Trp Ser Glu Thr Thr Val Ala Tyr Gln Leu Pro Lys Asp Tyr Gln
195 200 205

Gly Asp Gly Lys Asp Val Asp His Lys Ile Leu Lys Leu Leu Phe His
210 215 220

Arg Pro Tyr Phe His Val Asn Val Ile Asp Asp Val Ala Gly Ile Ser
225 230 235 240

Ile Ala Gly Ala Leu Lys Asn Val Val Ala Leu Ala Cys Gly Phe Val
245 250 255

Glu Gly Met Gly Trp Gly Asn Asn Ala Ser Ala Ala Ile Gln Arg Leu
260 265 270

Gly Leu Gly Glu Ile Ile Lys Phe Gly Arg Met Phe Phe Pro Glu Ser
275 280 285

Lys Val Glu Thr Tyr Tyr Gln Glu Ser Ala Gly Val Ala Asp Leu Ile
290 295 300

Thr Thr Cys Ser Gly Gly Arg Asn Val Lys Val Ala Thr Tyr Met Ala
305 310 315 320

Lys Thr Gly Lys Ser Ala Leu Glu Ala Glu Lys Glu Leu Leu Asn Gly
325 330 335

Gln Ser Ala Gln Gly Ile Ile Thr Cys Arg Glu Val His Glu Trp Leu
340 345 350

Gln Thr Cys Glu Leu Thr Gln Glu Phe Pro Ile Ile Arg Gly Ser Leu
355 360 365

Pro Asp Ser Leu Gln Gln Arg Pro His Gly Arg Pro Thr Gly Asp Asp
370 375 380

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 614 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Thr Arg Ala Thr Trp Cys Asn Ser Pro Pro Pro Leu His Arg Gln
1 5 10 15

Val Ser Arg Arg Asp Leu Leu Asp Arg Leu Asp Lys Thr His Gln Phe
20 25 30

Asp Val Leu Ile Ile Gly Gly Gly Ala Thr Gly Thr Gly Cys Ala Leu
35 40 45

Asp Ala Ala Thr Arg Gly Leu Asn Val Ala Leu Val Glu Lys Gly Asp
50 55 60

Phe Ala Ser Gly Thr Ser Ser Lys Ser Thr Lys Met Ile His Gly Gly
65 70 75 80

Val Arg Tyr Leu Glu Lys Ala Phe Trp Glu Phe Ser Lys Ala Gln Leu
85 90 95

Asp Leu Val Ile Glu Ala Leu Asn Glu Arg Lys His Leu Ile Asn Thr
100 105 110

Ala Pro His Leu Cys Thr Val Leu Pro Ile Leu Ile Pro Ile Tyr Ser
115 120 125

Thr Trp Gln Val Pro Tyr Ile Tyr Met Gly Cys Lys Phe Tyr Asp Phe
130 135 140

Phe Gly Gly Ser Gln Asn Leu Lys Lys Ser Tyr Leu Leu Ser Lys Ser
145 150 155 160

Ala Thr Val Glu Lys Ala Pro Met Leu Thr Thr Asp Asn Leu Lys Ala
165 170 175

Ser Leu Val Tyr His Asp Gly Ser Phe Asn Asp Ser Arg Leu Asn Ala
180 185 190

Thr Leu Ala Ile Thr Gly Val Glu Asn Gly Ala Thr Val Leu Ile Tyr
195 200 205

Val Glu Val Gln Lys Leu Ile Lys Asp Pro Thr Ser Gly Lys Val Ile
210 215 220

Gly Ala Glu Ala Arg Asp Val Glu Thr Asn Glu Leu Val Arg Ile Asn
225 230 235 240

Ala Lys Cys Val Val Asn Ala Thr Gly Pro Tyr Ser Asp Ala Ile Leu
245 250 255

Gln Met Asp Arg Asn Pro Ser Gly Leu Pro Asp Ser Pro Leu Asn Asp
260 265 270

Asn Ser Lys Ile Lys Ser Thr Phe Asn Gln Ile Ser Val Met Asp Pro
275 280 285

Lys Met Val Ile Pro Ser Ile Gly Val His Ile Val Leu Pro Ser Phe
290 295 300

Tyr Ser Pro Lys Asp Met Gly Leu Leu Asp Val Arg Thr Ser Asp Gly
305 310 315 320

Arg Val Met Phe Phe Leu Pro Trp Gln Gly Lys Val Leu Ala Gly Thr
325 330 335

Thr Asp Ile Pro Leu Lys Gln Val Pro Glu Asn Pro Met Pro Thr Glu
340 345 350

Ala Asp Ile Gln Asp Ile Leu Lys Glu Leu Gln His Tyr Ile Glu Phe
355 360 365

Pro Val Lys Arg Glu Asp Val Leu Ser Ala Trp Ala Gly Val Arg Pro
370 375 380

Leu Val Arg Asp Pro Arg Thr Ile Pro Ala Asp Gly Lys Lys Gly Ser
385 390 395 400

Ala Thr Gln Gly Val Val Arg Ser His Phe Leu Phe Thr Ser Asp Asn
405 410 415

Gly Leu Ile Thr Ile Ala Gly Gly Lys Trp Thr Thr Tyr Arg Gln Met
420 425 430

Ala Glu Glu Thr Val Asp Lys Val Val Glu Val Gly Gly Phe His Asn
435 440 445

Leu Lys Pro Cys His Thr Arg Asp Ile Lys Leu Ala Gly Ala Glu Glu
450 455 460

Trp Thr Gln Asn Tyr Val Ala Leu Leu Ala Gln Asn Tyr His Leu Ser
465 470 475 480

Ser Lys Met Ser Asn Tyr Leu Val Gln Asn Tyr Gly Thr Arg Ser Ser
485 490 495

Ile Ile Cys Glu Phe Phe Lys Glu Ser Met Glu Asn Lys Leu Pro Leu
500 505 510

Ser Leu Ala Asp Lys Glu Asn Asn Val Ile Tyr Ser Ser Glu Glu Asn
515 520 525

Asn Leu Val Asn Phe Asp Thr Phe Arg Tyr Pro Phe Thr Ile Gly Glu
530 535 540

Leu Lys Tyr Ser Met Gln Tyr Glu Tyr Cys Arg Thr Pro Leu Asp Phe
545 550 555 560

Leu Leu Arg Arg Thr Arg Phe Ala Phe Leu Asp Ala Lys Glu Ala Leu
565 570 575

Asn Ala Val His Ala Thr Val Lys Val Met Gly Asp Glu Phe Asn Trp
580 585 590

Ser Glu Lys Lys Arg Gln Trp Glu Leu Glu Lys Thr Val Asn Phe Ile
595 600 605

Gln Gly Arg Phe Gly Val
610

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 339 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Asn Gln Arg Asn Ala Ser Met Thr Val Ile Gly Ala Gly Ser Tyr
1 5 10 15

Gly Thr Ala Leu Ala Ile Thr Leu Ala Arg Asn Gly His Glu Val Val
20 25 30

Leu Trp Gly His Asp Pro Glu His Ile Ala Thr Leu Glu Arg Asp Arg
35 40 45

Cys Asn Ala Ala Phe Leu Pro Asp Val Pro Phe Pro Asp Thr Leu His
50 55 60

Leu Glu Ser Asp Leu Ala Thr Ala Leu Ala Ala Ser Arg Asn Ile Leu
65 70 75 80

Val Val Val Pro Ser His Val Phe Gly Glu Val Leu Arg Gln Ile Lys
85 90 95

Pro Leu Met Arg Pro Asp Ala Arg Leu Val Trp Ala Thr Lys Gly Leu
100 105 110

Glu Ala Glu Thr Gly Arg Leu Leu Gln Asp Val Ala Arg Glu Ala Leu
115 120 125

Gly Asp Gln Ile Pro Leu Ala Val Ile Ser Gly Pro Thr Phe Ala Lys
130 135 140

Glu Leu Ala Ala Gly Leu Pro Thr Ala Ile Ser Leu Ala Ser Thr Asp
145 150 155 160

Gln Thr Phe Ala Asp Asp Leu Gln Gln Leu Leu His Cys Gly Lys Ser
165 170 175

Phe Arg Val Tyr Ser Asn Pro Asp Phe Ile Gly Val Gln Leu Gly Gly
180 185 190

Ala Val Lys Asn Val Ile Ala Ile Gly Ala Gly Met Ser Asp Gly Ile
195 200 205

Gly Phe Gly Ala Asn Ala Arg Thr Ala Leu Ile Thr Arg Gly Leu Ala
210 215 220

Glu Met Ser Arg Leu Gly Ala Ala Leu Gly Ala Asp Pro Ala Thr Phe
225 230 235 240

Met Gly Met Ala Gly Leu Gly Asp Leu Val Leu Thr Cys Thr Asp Asn
245 250 255

Gln Ser Arg Asn Arg Arg Phe Gly Met Met Leu Gly Gln Gly Met Asp
260 265 270

Val Gln Ser Ala Gln Glu Lys Ile Gly Gln Val Val Glu Gly Tyr Arg
275 280 285

Asn Thr Lys Glu Val Arg Glu Leu Ala His Arg Phe Gly Val Glu Met
290 295 300

Pro Ile Thr Glu Glu Ile Tyr Gln Val Leu Tyr Cys Gly Lys Asn Ala
305 310 315 320

Arg Glu Ala Ala Leu Thr Leu Leu Gly Arg Ala Arg Lys Asp Glu Arg
325 330 335

Ser Ser His

(2) INFORMATION FOR SEQ ID NO: 11: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 501 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Glu Thr Lys Asp Leu Ile Val Ile Gly Gly Gly Ile Asn Gly Ala
1 5 10 15

Gly Ile Ala Ala Asp Ala Ala Gly Arg Gly Leu Ser Val Leu Met Leu
20 25 30

Glu Ala Gln Asp Leu Ala Cys Ala Thr Ser Ser Ala Ser Ser Lys Leu
35 40 45

Ile His Gly Gly Leu Arg Tyr Leu Glu His Tyr Glu Phe Arg Leu Val
50 55 60

Ser Glu Ala Leu Ala Glu Arg Glu Val Leu Leu Lys Met Ala Pro His
65 70 75 80

Ile Ala Phe Pro Met Arg Phe Arg Leu Pro His Arg Pro His Leu Arg
85 90 95

Pro Ala Trp Met Ile Arg Ile Gly Leu Phe Met Tyr Asp His Leu Gly
100 105 110

Lys Arg Thr Ser Leu Pro Gly Ser Thr Gly Leu Arg Phe Gly Ala Asn
115 120 125

Ser Val Leu Lys Pro Glu Ile Lys Arg Gly Phe Glu Tyr Ser Asp Cys
130 135 140

Trp Val Asp Asp Ala Arg Leu Val Leu Ala Asn Ala Gln Met Val Val
145 150 155 160

Arg Lys Gly Gly Glu Val Leu Thr Arg Thr Arg Ala Thr Ser Ala Arg
165 170 175

Arg Glu Asn Gly Leu Trp Ile Val Glu Ala Glu Asp Ile Asp Thr Gly
180 185 190

Lys Lys Tyr Ser Trp Gln Ala Arg Gly Leu Val Asn Ala Thr Gly Pro
195 200 205

Trp Val Lys Gln Phe Phe Asp Asp Gly Met His Leu Pro Ser Pro Tyr
210 215 220

Gly Ile Arg Leu Ile Lys Gly Ser His Ile Val Val Pro Arg Val His
225 230 235 240

Thr Gln Lys Gln Ala Tyr Ile Leu Gln Asn Glu Asp Lys Arg Ile Val
245 250 255

Phe Val Ile Pro Trp Met Asp Glu Phe Ser Ile Ile Gly Thr Thr Asp
260 265 270

Val Glu Tyr Lys Gly Asp Pro Lys Ala Val Lys Ile Glu Glu Ser Glu
275 280 285

Ile Asn Tyr Leu Leu Asn Val Tyr Asn Thr His Phe Lys Lys Gln Leu
290 295 300

Ser Arg Asp Asp Ile Val Trp Thr Tyr Ser Gly Val Arg Pro Leu Cys
305 310 315 320

Asp Asp Glu Ser Asp Ser Pro Gln Ala Ile Thr Arg Asp Tyr Thr Leu
325 330 335

Asp Ile His Asp Glu Asn Gly Lys Ala Pro Leu Leu Ser Val Phe Gly
340 345 350

Gly Lys Leu Thr Thr Tyr Arg Lys Leu Ala Glu His Ala Leu Glu Lys
355 360 365

Leu Thr Pro Tyr Tyr Gln Gly Ile Gly Pro Ala Trp Thr Lys Glu Ser
370 375 380

Val Leu Pro Gly Gly Ala Ile Glu Gly Asp Arg Asp Asp Tyr Ala Ala
385 390 395 400

Arg Leu Arg Arg Arg Tyr Pro Phe Leu Thr Glu Ser Leu Ala Arg His
405 410 415

Tyr Ala Arg Thr Tyr Gly Ser Asn Ser Glu Leu Leu Leu Gly Asn Ala
420 425 430

Gly Thr Val Ser Asp Leu Gly Glu Asp Phe Gly His Glu Phe Tyr Glu
435 440 445

Ala Glu Leu Lys Tyr Leu Val Asp His Glu Trp Val Arg Arg Ala Asp
450 455 460

Asp Ala Leu Trp Arg Arg Thr Lys Gln Gly Met Trp Leu Asn Ala Asp
465 470 475 480

Gln Gln Ser Arg Val Ser Gln Trp Leu Val Glu Tyr Thr Gln Gln Arg
485 490 495

Leu Ser Leu Ala Ser
500

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 542 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Lys Thr Arg Asp Ser Gln Ser Ser Asp Val Ile Ile Ile Gly Gly
1 5 10 15

Gly Ala Thr Gly Ala Gly Ile Ala Arg Asp Cys Ala Leu Arg Gly Leu
20 25 30

Arg Val Ile Leu Val Glu Arg His Asp Ile Ala Thr Gly Ala Thr Gly
35 40 45

Arg Asn His Gly Leu Leu His Ser Gly Ala Arg Tyr Ala Val Thr Asp
50 55 60

Ala Glu Ser Ala Arg Glu Cys Ile Ser Glu Asn Gln Ile Leu Lys Arg
65 70 75 80

Ile Ala Arg His Cys Val Glu Pro Thr Asn Gly Leu Phe Ile Thr Leu
85 90 95

Pro Glu Asp Asp Leu Ser Phe Gln Ala Thr Phe Ile Arg Ala Cys Glu
100 105 110

Glu Ala Gly Ile Ser Ala Glu Ala Ile Asp Pro Gln Gln Ala Arg Ile
115 120 125

Ile Glu Pro Ala Val Asn Pro Ala Leu Ile Gly Ala Val Lys Val Pro
130 135 140

Asp Gly Thr Val Asp Pro Phe Arg Leu Thr Ala Ala Asn Met Leu Asp
145 150 155 160

Ala Lys Glu His Gly Ala Val Ile Leu Thr Ala His Glu Val Thr Gly
165 170 175

Leu Ile Arg Glu Gly Ala Thr Val Cys Gly Val Arg Val Arg Asn His
180 185 190

Leu Thr Gly Glu Thr Gln Ala Leu His Ala Pro Val Val Val Asn Ala
195 200 205

Ala Gly Ile Trp Gly Gln His Ile Ala Glu Tyr Ala Asp Leu Arg Ile
210 215 220

Arg Met Phe Pro Ala Lys Gly Ser Leu Leu Ile Met Asp His Arg Ile
225 230 235 240

Asn Gln His Val Ile Asn Arg Cys Arg Lys Pro Ser Asp Ala Asp Ile
245 250 255

Leu Val Pro Gly Asp Thr Ile Ser Leu Ile Gly Thr Thr Ser Leu Arg
260 265 270

Ile Asp Tyr Asn Glu Ile Asp Asp Asn Arg Val Thr Ala Glu Glu Val
275 280 285

Asp Ile Leu Leu Arg Glu Gly Glu Lys Leu Ala Pro Val Met Ala Lys
290 295 300

Thr Arg Ile Leu Arg Ala Tyr Ser Gly Val Arg Pro Leu Val Ala Ser
305 310 315 320

Asp Asp Asp Pro Ser Gly Arg Asn Leu Ser Arg Gly Ile Val Leu Leu
325 330 335

Asp His Ala Glu Arg Asp Gly Leu Asp Gly Phe Ile Thr Ile Thr Gly
340 345 350

Gly Lys Leu Met Thr Tyr Arg Leu Met Ala Glu Trp Ala Thr Asp Ala
355 360 365

Val Cys Arg Lys Leu Gly Asn Thr Arg Pro Cys Thr Thr Ala Asp Leu
370 375 380

Ala Leu Pro Gly Ser Gln Glu Pro Ala Glu Val Thr Leu Arg Lys Val
385 390 395 400

Ile Ser Leu Pro Ala Pro Leu Arg Gly Ser Ala Val Tyr Arg His Gly
405 410 415

Asp Arg Thr Pro Ala Trp Leu Ser Glu Gly Arg Leu His Arg Ser Leu
420 425 430

Val Cys Glu Cys Glu Ala Val Thr Ala Gly Glu Val Gln Tyr Ala Val
435 440 445

Glu Asn Leu Asn Val Asn Ser Leu Leu Asp Leu Arg Arg Arg Thr Arg
450 455 460

Val Gly Met Gly Thr Cys Gln Gly Glu Leu Cys Ala Cys Arg Ala Ala
465 470 475 480

Gly Leu Leu Gln Arg Phe Asn Val Thr Thr Ser Ala Gln Ser Ile Glu
485 490 495

Gln Leu Ser Thr Phe Leu Asn Glu Arg Trp Lys Gly Val Gln Pro Ile
500 505 510

Ala Trp Gly Asp Ala Leu Arg Glu Ser Glu Phe Thr Arg Trp Val Tyr
515 520 525

Gln Gly Leu Cys Gly Leu Glu Lys Glu Gln Lys Asp Ala Leu
530 535 540

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 250 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Gly Leu Thr Thr Lys Pro Leu Ser Leu Lys Val Asn Ala Ala Leu
1 5 10 15

Phe Asp Val Asp Gly Thr Ile Ile Ile Ser Gln Pro Ala Ile Ala Ala
20 25 30

Phe Trp Arg Asp Phe Gly Lys Asp Lys Pro Tyr Phe Asp Ala Glu His
35 40 45

Val Ile Gln Val Ser His Gly Trp Arg Thr Phe Asp Ala Ile Ala Lys
50 55 60

Phe Ala Pro Asp Phe Ala Asn Glu Glu Tyr Val Asn Lys Leu Glu Ala
65 70 75 80

Glu Ile Pro Val Lys Tyr Gly Glu Lys Ser Ile Glu Val Pro Gly Ala
85 90 95

Val Lys Leu Cys Asn Ala Leu Asn Ala Leu Pro Lys Glu Lys Trp Ala
100 105 110

Val Ala Thr Ser Gly Thr Arg Asp Met Ala Gln Lys Trp Phe Glu His
115 120 125

Leu Gly Ile Arg Arg Pro Lys Tyr Phe Ile Thr Ala Asn Asp Val Lys
130 135 140

Gln Gly Lys Pro His Pro Glu Pro Tyr Leu Lys Gly Arg Asn Gly Leu
145 150 155 160

Gly Tyr Pro Ile Asn Glu Gln Asp Pro Ser Lys Ser Lys Val Val Val
165 170 175

Phe Glu Asp Ala Pro Ala Gly Ile Ala Ala Gly Lys Ala Ala Gly Cys
180 185 190

Lys Ile Ile Gly Ile Ala Thr Thr Phe Asp Leu Asp Phe Leu Lys Glu
195 200 205

Lys Gly Cys Asp Ile Ile Val Lys Asn His Glu Ser Ile Arg Val Gly
210 215 220

Gly Tyr Asn Ala Glu Thr Asp Glu Val Glu Phe Ile Phe Asp Asp Tyr
225 230 235 240

Leu Tyr Ala Lys Asp Asp Leu Leu Lys Trp
245 250

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 271 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Lys Arg Phe Asn Val Leu Lys Tyr Ile Arg Thr Thr Lys Ala Asn
1 5 10 15

Ile Gln Thr Ile Ala Met Pro Leu Thr Thr Lys Pro Leu Ser Leu Lys
20 25 30

Ile Asn Ala Ala Leu Phe Asp Val Asp Gly Thr Ile Ile Ile Ser Gln
35 40 45

Pro Ala Ile Ala Ala Phe Trp Arg Asp Phe Gly Lys Asp Lys Pro Tyr
50 55 60

Phe Asp Ala Glu His Val Ile His Ile Ser His Gly Trp Arg Thr Tyr
65 70 75 80




Asp Ala Ile Ala Lys Phe Ala Pro Asp Phe Ala Asp Glu Glu Tyr Val
85 90 95

Asn Lys Leu Glu Gly Glu Ile Pro Glu Lys Tyr Gly Glu His Ser Ile
100 105 110

Glu Val Pro Gly Ala Val Lys Leu Cys Asn Ala Leu Asn Ala Leu Pro
115 120 125

Lys Glu Lys Trp Ala Val Ala Thr Ser Gly Thr Arg Asp Met Ala Lys
130 135 140

Lys Trp Phe Asp Ile Leu Lys Ile Lys Arg Pro Glu Tyr Phe Ile Thr
145 150 155 160

Ala Asn Asp Val Lys Gln Gly Lys Pro His Pro Glu Pro Tyr Leu Lys
165 170 175

Gly Arg Asn Gly Leu Gly Phe Pro Ile Asn Glu Gln Asp Pro Ser Lys
180 185 190

Ser Lys Val Val Val Phe Glu Asp Ala Pro Ala Gly Ile Ala Ala Gly
195 200 205

Lys Ala Ala Gly Cys Lys Ile Val Gly Ile Ala Thr Thr Phe Asp Leu
210 215 220

Asp Phe Leu Lys Glu Lys Gly Cys Asp Ile Ile Val Lys Asn His Glu
225 230 235 240

Ser Ile Arg Val Gly Glu Tyr Asn Ala Glu Thr Asp Glu Val Glu Leu
245 250 255

Ile Phe Asp Asp Tyr Leu Tyr Ala Lys Asp Asp Leu Leu Lys Trp
260 265 270

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 709 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Phe Pro Ser Leu Phe Arg Leu Val Val Phe Ser Lys Arg Tyr Ile
1 5 10 15

Phe Arg Ser Ser Gln Arg Leu Tyr Thr Ser Leu Lys Gln Glu Gln Ser
20 25 30

Arg Met Ser Lys Ile Met Glu Asp Leu Arg Ser Asp Tyr Val Pro Leu
35 40 45

Ile Ala Ser Ile Asp Val Gly Thr Thr Ser Ser Arg Cys Ile Leu Phe
50 55 60

Asn Arg Trp Gly Gln Asp Val Ser Lys His Gln Ile Glu Tyr Ser Thr
65 70 75 80

Ser Ala Ser Lys Gly Lys Ile Gly Val Ser Gly Leu Arg Arg Pro Ser
85 90 95

Thr Ala Pro Ala Arg Glu Thr Pro Asn Ala Gly Asp Ile Lys Thr Ser
100 105 110

Gly Lys Pro Ile Phe Ser Ala Glu Gly Tyr Ala Ile Gln Glu Thr Lys
115 120 125

Phe Leu Lys Ile Glu Glu Leu Asp Leu Asp Phe His Asn Glu Pro Thr
130 135 140

Leu Lys Phe Pro Lys Pro Gly Trp Val Glu Cys His Pro Gln Lys Leu
145 150 155 160

Leu Val Asn Val Val Gln Cys Leu Ala Ser Ser Leu Leu Ser Leu Gln
165 170 175

Thr Ile Asn Ser Glu Arg Val Ala Asn Gly Leu Pro Pro Tyr Lys Val
180 185 190

Ile Cys Met Gly Ile Ala Asn Met Arg Glu Thr Thr Ile Leu Trp Ser
195 200 205

Arg Arg Thr Gly Lys Pro Ile Val Asn Tyr Gly Ile Val Trp Asn Asp
210 215 220

Thr Arg Thr Ile Lys Ile Val Arg Asp Lys Trp Gln Asn Thr Ser Val
225 230 235 240

Asp Arg Gln Leu Gln Leu Arg Gln Lys Thr Gly Leu Pro Leu Leu Ser
245 250 255

Thr Tyr Phe Ser Cys Ser Lys Leu Arg Trp Phe Leu Asp Asn Glu Pro
260 265 270

Leu Cys Thr Lys Ala Tyr Glu Glu Asn Asp Leu Met Phe Gly Thr Val
275 280 285

Asp Thr Trp Leu Ile Tyr Gln Leu Thr Lys Gln Lys Ala Phe Val Ser
290 295 300

Asp Val Thr Asn Ala Ser Arg Thr Gly Phe Met Asn Leu Ser Thr Leu
305 310 315 320

Lys Tyr Asp Asn Glu Leu Leu Glu Phe Trp Gly Ile Asp Lys Asn Leu
325 330 335

Ile His Met Pro Glu Ile Val Ser Ser Ser Gln Tyr Tyr Gly Asp Phe
340 345 350

Gly Ile Pro Asp Trp Ile Met Glu Lys Leu His Asp Ser Pro Lys Thr
355 360 365

Val Leu Arg Asp Leu Val Lys Arg Asn Leu Pro Ile Gln Gly Cys Leu
370 375 380

Gly Asp Gln Ser Ala Ser Met Val Gly Gln Leu Ala Tyr Lys Pro Gly
385 390 395 400

Ala Ala Lys Cys Thr Tyr Gly Thr Gly Cys Phe Leu Leu Tyr Asn Thr
405 410 415

Gly Thr Lys Lys Leu Ile Ser Gln His Gly Ala Leu Thr Thr Leu Ala
420 425 430

Phe Trp Phe Pro His Leu Gln Glu Tyr Gly Gly Gln Lys Pro Glu Leu
435 440 445

Ser Lys Pro His Phe Ala Leu Glu Gly Ser Val Ala Val Ala Gly Ala
450 455 460

Val Val Gln Trp Leu Arg Asp Asn Leu Arg Leu Ile Asp Lys Ser Glu
465 470 475 480

Asp Val Gly Pro Ile Ala Ser Thr Val Pro Asp Ser Gly Gly Val Val
485 490 495

Phe Val Pro Ala Phe Ser Gly Leu Phe Ala Pro Tyr Trp Asp Pro Asp
500 505 510

Ala Arg Ala Thr Ile Met Gly Met Ser Gln Phe Thr Thr Ala Ser His
515 520 525

Ile Ala Arg Ala Ala Val Glu Gly Val Cys Phe Gln Ala Arg Ala Ile
530 535 540

Leu Lys Ala Met Ser Ser Asp Ala Phe Gly Glu Gly Ser Lys Asp Arg
545 550 555 560

Asp Phe Leu Glu Glu Ile Ser Asp Val Thr Tyr Glu Lys Ser Pro Leu
565 570 575

Ser Val Leu Ala Val Asp Gly Gly Met Ser Arg Ser Asn Glu Val Met
580 585 590

Gln Ile Gln Ala Asp Ile Leu Gly Pro Cys Val Lys Val Arg Arg Ser
595 600 605

Pro Thr Ala Glu Cys Thr Ala Leu Gly Ala Ala Ile Ala Ala Asn Met
610 615 620

Ala Phe Lys Asp Val Asn Glu Arg Pro Leu Trp Lys Asp Leu His Asp
625 630 635 640

Val Lys Lys Trp Val Phe Tyr Asn Gly Met Glu Lys Asn Glu Gln Ile
645 650 655

Ser Pro Glu Ala His Pro Asn Leu Lys Ile Phe Arg Ser Glu Ser Asp
660 665 670

Asp Ala Glu Arg Arg Lys His Trp Lys Tyr Trp Glu Val Ala Val Glu
675 680 685

Arg Ser Lys Gly Trp Leu Lys Asp Ile Glu Gly Glu His Glu Gln Val
690 695 700

Leu Glu Asn Phe Gln
705

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

GCGCGGATCC AGGAGTCTAG AATTATGGGA TTGACTACTA AACCTCTATC T 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

GATACGCCCG GGTTACCATT TCAACAGATC GTCCTT 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
TTGATAATAT AACCATGGCT GCTGCTGCTG ATAG

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
GTATGATATG TTATCTTGGA TCCAATAAAT CTAATCTTC 39 
(2) INFORMATION FOR SEQ ID NO:20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
CATGACTAGT AAGGAGGACA ATTC 24

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
CATGGAATTG TCCTCCTTAC TAGT 24 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
CTAGTAAGGA GGACAATTC 19

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CATGGAATTG TCCTCCTTA 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER"

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 
GATCCAGGAA ACAGA 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CTAGTCTGTT TCCTG 

(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
GCTTTCTGTG CTGCGGCTTT AG 22

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
TGGTCGAGGA TCCACTTCAC TTT 23 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
AAAGTGAAGT GGATCCTCGA CCAATTGGAT GGTGGCGCAG TAGCAAACAA T 51 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 
GGATCACCGC CGCAGAAACT ACG 23 
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
CTGTCAGCCG TTAAGTGTTC CTGTG 25

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
CAGTTCAACC TGTTGATAGT ACG 23 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
ATGAGTCAAA CATCAACCTT 20 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
ATGGAGAAAA AAATCACTGG 20 
(2) INFORMATION FOR SEQ ID NO:34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
TTACGCCCCG CCCTGCCACT

(2) INFORMATION FOR SEQ ID NO: 35:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
TCAGAGGATG TGCACCTGCA 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
CGAGCATGCC GCATTTGGCA CTACTC 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GCGTCTAGAG TAGGTTATTC CCACTCTTG 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
GAAGTCGACC GCTGCGCCTT ATCCGG

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
CGCGTCGACG TTTACAATTT CAGGTGGC 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
GCAGCATGCT GGACTGGTAG TAG 
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
CAGTCTAGAG TTATTGGCAA ACCTACC 
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
GATGCATGCC CAGGGCGGAG ACGGC

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
CTAACGATTG TTCTCTAGAG AAAATGTCC 29

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
CACGCATGCA GTTCAACCTG TTGATAGTAC 30 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer"

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
GCGTCTAGAT CCTTTTAAAT TAAAAATG 28